
Physician Mental Workload Scale in China: Development and Psychometric Evaluation
  1. Chuntao Lu1,2,
  2. Yinhuan Hu1,
  3. Qiang Fu3,
  4. Samuel Governor3,
  5. Liuming Wang4,
  6. Chao Li2,
  7. Lu Deng1,
  8. Jinzhu Xie5
  1. 1 School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  2. 2 Jingmen NO.2 People's Hospital, Jingmen, China
  3. 3 Department of Epidemiology and Biostatistics, College for Public Health and Social Justice, Saint Louis University, Saint Louis, Missouri, USA
  4. 4 Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  5. 5 The Third People's Hospital of Hubei Province, Wuhan, China
  1. Correspondence to Prof. Yinhuan Hu; hyh288{at}


Objective The purpose of our study is to develop a mental workload scale for physicians in China and assess the scale’s reliability and validity.

Design The instrument was developed over three phases involving 396 physicians from different tiers of comprehensive public hospitals in China. In the first phase, an initial item pool was developed through a systematic literature review. The second phase consisted of two rounds of Delphi expert consultations and a pilot survey. The third phase tested the reliability and validity of the instrument.

Setting Public hospitals in China.

Participants A total of 396 physicians from different tiers of comprehensive public hospitals in China participated in this study in 2018.

Primary and secondary outcome measures Cronbach’s α, content validity index, item-total score correlation coefficient, dimension-total score correlation coefficient and indices of confirmatory factor analysis.

Results Six dimensions (mental demands, physical demands, temporal demands, perceived risk, frustration level and performance) and 12 items were identified in the instrument. For reliability, Cronbach’s α for the whole scale was 0.81. For validity, the corrected item-content validity index of each item ranged from 0.85 to 1, item-total score correlation coefficients ranged from 0.31 to 0.75, and the correlation coefficients between the dimensions and total score ranged from 0.37 to 0.72. The results of the confirmatory factor analysis showed that the goodness-of-fit indices of the scale were satisfactory.

Conclusion The instrument showed good reliability and validity, and it is useful for diagnosing the mental workload of physicians.

  • physician
  • mental health
  • workload
  • surveys and questionnaires
  • hospitals

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • This is the first study to develop a measurement of physician mental workload from a subjective perspective in China.

  • Qualitative and quantitative methods were involved in item selection.

  • There was a potential reporting bias in the self-reported measurements of physician workload.

  • There was a potential selection bias because all respondents participated voluntarily rather than being randomly selected.

  • Among the six dimensions, perceived risk included only one item, which may have resulted in measurement error.


Internationally, there has been a focus on the relationship between physicians’ workload and their health.1 Physicians’ health is highly associated with their workload.2 Excessive workload impacts physicians’ health3 4 and increases the risk of work-related musculoskeletal disorders.5 6 High workload is related to adverse effects in the form of medical errors7 and adverse incidents.8 Physician workload can negatively contribute to patients’ perceived quality of care9 and affect patient satisfaction10 and safety.11 12 It is possible that these stressors have reached a point where they pose a severe problem for the entire healthcare system,13 as physicians’ unreasonable and overwhelming workload has adverse effects on physicians, patients and healthcare organisations.14

Workload is thought to be multidimensional and multifaceted.15 One aspect of workload includes the subjective psychological experiences of the human operator.16 Mental workload has emerged as one of the most critical occupational risk factors that results in burnout or anxiety.17 A lack of control over workload is expected to correlate closely with burnout.18 19 Heavy mental workload can lead to serious health problems (cardiovascular diseases, digestive problems and so on) for physicians17 and a poorer quality of care.20 Currently, the European Pact for Mental Health and Welfare is conducting mental workload assessments to promote physical and mental well-being.21

Different tools have been proposed to assess mental workload. Previous research established a brief instrument with six items to measure physician mental workload.22 The NASA-Task Load Index (NASA-TLX) scale, which is widely used in measuring mental workload,23 has proven to be a sensitive, valid and reliable instrument24 and can be used in human factors research.25 Researchers have adapted it into a 29-item questionnaire in Spain to measure workers’ mental workload.26 The existing body of research on NASA-TLX suggests that it can be used to measure nurse workload in healthcare settings.27–29 In the same vein, the Subjective Workload Assessment Technique (SWAT) is a subjective rating technique with three dimensions (time load, mental effort load and psychological stress load) and is used to assess mental workload.30 It has been successfully applied in assessing mental workload under several aircraft multitasking conditions, such as the mental workload required by different air defence systems.24 The Copenhagen Psychosocial Questionnaire is a widespread tool used in the industrial and service branches in Europe, and its main dimensions include the most influential psychosocial theories at work.31 Together, these tools provide essential insights into workload measurement in healthcare management, especially in nurse workload measurement. However, the workload of physicians is essentially different from the workload of the nurses and other workers that previous measurements were designed to assess. Thus, it remains unclear whether these tools can be directly used in measuring physician mental workload, and a mental workload measure needs to be developed specifically for physicians.

With increasing patient health demands, physicians in China tend to have a heavier workload, worse physical health, more mental strain and more tense relationships with patients.32 Data from several studies suggest that most physicians work more than 10 hours a day33 to manage outpatients and inpatients. On average, a physician in a tertiary hospital is responsible for 8.10 outpatients and 2.70 beds per day.32 Physicians have been abused, injured and, in extreme cases, murdered by patients or their relatives in hospitals across China,34 which results in psychological stress. Establishing a workload measurement system for medical personnel has been incorporated into the Chinese Patient Safety Goals by the Chinese Hospital Association.35

Existing studies of workload measurement instruments in China concentrate on objective workload, for example, work time. Although physicians’ mental workload is a critical problem, few instruments explore it in China. The purpose of this paper is to develop a scientific mental workload instrument that can be used to assess the mental workload of physicians.


Study design

The instrument was developed in three phases. In the first phase, an initial item pool was developed by integrating previous studies through a systematic literature review. The second phase consisted of two rounds of Delphi expert consultations and a pilot survey in 2017. The third phase involved testing the psychometric properties of the instrument, including its reliability and validity, through a study conducted in 2018 in comprehensive public hospitals in China.

Framework and item generation and selection

We combined the dimensions of the NASA-TLX and SWAT frameworks to determine the item pool so that it would measure the current situation of Chinese physicians’ workload. Six dimensions and 15 items were sent to 20 experts (including physicians, hospital managers, researchers and human resource managers) for consultation. In accordance with the findings from two rounds of expert consultation, we deleted four items, added a new item (the intensity of physical activity) and revised the descriptions of all items. This left six dimensions (physical demands, mental demands, temporal demands, perceived risk, frustration level and performance) and 12 items, which formed a prescale with items scored from 0 to 100.

In the presurvey analysis, we selected three hospitals (one tertiary hospital, one secondary hospital and one first-tier hospital) through convenience sampling. A sample of 80 physicians was surveyed with a web-based scale during November and December 2017. Finally, a valid sample of 74 physicians was used for item selection. Items were refined based on the following indexes or methods: critical ratio, coefficient of variation, correlation analysis,36 Cronbach’s α37 and exploratory factor analysis (EFA).38

If an item failed any of the above criteria, it was deleted or revised. The final scale consisted of six dimensions (mental demands, physical demands, temporal demands, perceived risk, frustration level and performance) and 12 items (table 1).

Table 1

Dimensions and items of physician mental workload scale

Data collection for testing the validity and reliability of the scale

To check the validity and reliability of the developed scale, we planned to survey 400 respondents (physicians working in hospitals) from different tiers of hospitals (two tertiary hospitals, two secondary hospitals and two first-tier hospitals). These hospitals were randomly selected from Hubei province, China. We used wenjuanxing, a widely used website for conducting surveys in China, to develop an electronic questionnaire with which to survey physicians. Respondents could scan the access code or open the website link on their phones to access and complete the electronic questionnaire. We sent the access code and website link to the human resource managers at each participating hospital, who then forwarded the access code to the physicians’ online communication group at each hospital. Three hundred and ninety-six physicians voluntarily participated in the survey before March 2018; 11 invalid responses were excluded.

The detailed scale instructions indicated that our scale was anonymous, that participation was voluntary and that our survey aimed to develop a physician mental workload scale, so the results would not be used for other purposes. The physician mental workload scale included three parts. The first part of the scale included 12 items that respondents scored one by one. The second part was a table that included 15 pairs of dimensions and was used to collect the weights of each dimension. Every two dimensions formed a pair (eg, mental demands vs physical demands, mental demands vs temporal demands and so on). Respondents chose which of the two dimensions in each of the 15 pairs contributed more to their workload. Then, the weight of each dimension was equal to the number of times that dimension was selected divided by 15. The third part of the scale was designed to collect physicians’ individual characteristics.
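The pairwise weighting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `dimension_weights`, the data layout and the example respondent are all hypothetical.

```python
from collections import Counter
from itertools import combinations

DIMENSIONS = [
    "mental demands", "physical demands", "temporal demands",
    "perceived risk", "frustration level", "performance",
]

def dimension_weights(choices):
    """Turn one respondent's pairwise choices into dimension weights.

    `choices` maps each unordered pair of dimensions (a frozenset) to the
    dimension the respondent judged to contribute more to their workload.
    """
    pairs = [frozenset(p) for p in combinations(DIMENSIONS, 2)]  # 15 pairs
    if set(choices) != set(pairs):
        raise ValueError("expected exactly one choice for each of the 15 pairs")
    for pair, chosen in choices.items():
        if chosen not in pair:
            raise ValueError("the chosen dimension must belong to its pair")
    counts = Counter(choices.values())
    # Weight = number of times the dimension was selected, divided by 15.
    return {d: counts[d] / len(pairs) for d in DIMENSIONS}

# Hypothetical respondent who always picks the dimension listed first.
example_choices = {frozenset((a, b)): a for a, b in combinations(DIMENSIONS, 2)}
example_weights = dimension_weights(example_choices)
```

Because each of the 15 comparisons selects exactly one dimension, the six weights always sum to 1, and no single dimension can exceed 5/15.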

The response endpoints of the items are displayed in table 1. Items were scored as follows: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100. The average scores of all items for a corresponding dimension were multiplied by the dimension weight to produce the dimension scores, and then, the total scores were calculated as the sum of all dimension scores.
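As a worked illustration of this scoring rule, the sketch below computes a total score for one respondent. The item scores and weights are invented for the example, and `total_score` is a hypothetical helper rather than part of the published scale.

```python
from statistics import mean

ALLOWED_SCORES = set(range(0, 101, 10))  # 0, 10, ..., 100

def total_score(item_scores, weights):
    """Weighted total: sum over dimensions of (mean item score x weight)."""
    total = 0.0
    for dimension, scores in item_scores.items():
        if any(s not in ALLOWED_SCORES for s in scores):
            raise ValueError("item scores must be multiples of 10 from 0 to 100")
        total += mean(scores) * weights[dimension]
    return total

# Hypothetical respondent: 12 item scores grouped by dimension.
example_scores = {
    "mental demands": [70, 80, 90],
    "physical demands": [40, 60],
    "temporal demands": [80, 100],
    "perceived risk": [70],
    "frustration level": [30, 50],
    "performance": [60, 80],
}
# Hypothetical weights from the 15 pairwise comparisons (they sum to 1).
example_score_weights = {
    "mental demands": 5 / 15,
    "physical demands": 4 / 15,
    "temporal demands": 3 / 15,
    "perceived risk": 2 / 15,
    "frustration level": 1 / 15,
    "performance": 0 / 15,
}
example_total = total_score(example_scores, example_score_weights)
# (80*5 + 50*4 + 90*3 + 70*2 + 40*1 + 70*0) / 15 = 70
```

Since the weights sum to 1, the total score stays on the same 0 to 100 range as the individual items.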

Statistical analysis

Descriptive statistics were used to show the characteristics of the respondents, including gender, age, educational level (ie, PhD degree, master’s degree and undergraduate), job title (ie, senior, middle and junior), work years, hospital level (ie, tertiary hospitals, secondary hospitals and first-tier hospitals), work hours per week, number of outpatients served per day and self-perceived health status.

For the reliability of the scale, Cronbach’s α was used to assess the internal consistency of each instrument component. Values of 0.70 or higher for Cronbach’s α were considered acceptable.37
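For readers unfamiliar with the statistic, Cronbach’s α can be computed from the item variances and the variance of the summed score using the standard formula. The sketch below is illustrative only (the study used SPSS); the function name and the toy data are assumptions.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals),
    where k is the number of items. Sample variances are used throughout.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data: two items that move together perfectly, so alpha is 1.
toy_rows = [[10, 10], [20, 20], [30, 30]]
toy_alpha = cronbach_alpha(toy_rows)
```

A value of 0.70 or higher would meet the acceptability threshold stated above.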

The content validity index (CVI) of each item was calculated to assess the accuracy of the scale using scores of 1–4. Experts were invited to evaluate the items, with a score of 1 representing an item not relevant to the corresponding dimension and a score of 4 representing an item closely related to the corresponding dimension. The corrected item-content validity index (I-CVI) and average scale-content validity index (S-CVI/Ave) were calculated. A corrected I-CVI of 0.78 or higher and an S-CVI/Ave of 0.90 or greater were considered acceptable.39
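The CVI calculations can be sketched as below. The text does not spell out how the I-CVI was corrected, so this sketch assumes the commonly used chance-adjusted form (a modified kappa, in which the probability of chance agreement is C(N, A) × 0.5^N for A of N experts rating the item 3 or 4). The function names and the example panel of 20 raters are hypothetical.

```python
from math import comb

def i_cvi(ratings):
    """Proportion of experts rating the item 3 or 4 on the 1-4 scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

def corrected_i_cvi(ratings):
    """I-CVI adjusted for chance agreement (assumed modified-kappa form)."""
    n = len(ratings)
    agree = sum(1 for r in ratings if r >= 3)
    p_chance = comb(n, agree) * 0.5 ** n
    return (agree / n - p_chance) / (1 - p_chance)

def s_cvi_ave(ratings_per_item):
    """Scale-level CVI: the average of the item-level I-CVIs."""
    return sum(i_cvi(r) for r in ratings_per_item) / len(ratings_per_item)

# Hypothetical panel: 17 of 20 experts rate the item as relevant (3 or 4).
example_ratings = [4] * 17 + [2] * 3
example_i_cvi = i_cvi(example_ratings)              # 17 / 20 = 0.85
example_corrected = corrected_i_cvi(example_ratings)
```

With a panel of this size the chance correction is very small, so the corrected value stays close to the raw proportion.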

The test of construct validity was performed using the correlation coefficient method, EFA and confirmatory factor analysis (CFA). The item-total score correlation coefficients, dimension-total score correlation coefficients and dimension-dimension correlation coefficients were examined. Items with an item-total score correlation coefficient below 0.40 were to be revised or removed from the scale. The correlation coefficients among dimensions were expected to be lower than the dimension-total score correlation coefficients. A p value lower than 0.05 for Bartlett’s test of sphericity and a Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy higher than 0.70 and close to 1 were considered appropriate for factor analysis.40 EFA and CFA were used to explore and confirm the structure of the scale. For the EFA, we used the varimax rotation method to examine whether the structure matched the framework. For the CFA, the criteria for the model fit indices were as follows: χ2/df <3; root mean square error of approximation (RMSEA) ≤0.05; root mean square residual (RMR) <0.05; goodness-of-fit index (GFI) >0.90; comparative fit index (CFI) >0.90; and Tucker-Lewis index (TLI) >0.90.41 Statistical analyses were performed with SPSS V.21 and AMOS V.17 (IBM Corp, Armonk, New York, USA).
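The item-total screening rule can be illustrated with a short sketch. It assumes a Pearson correlation and, for simplicity, an unweighted sum of item scores as the total; the text does not specify either detail, and the function names and data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def item_total_correlations(rows):
    """Correlate each item with the (unweighted) sum of all items."""
    totals = [sum(row) for row in rows]
    return [pearson([row[i] for row in rows], totals) for i in range(len(rows[0]))]

def items_to_review(rows, threshold=0.40):
    """Indices of items whose item-total correlation falls below the threshold."""
    return [i for i, r in enumerate(item_total_correlations(rows)) if r < threshold]

# Hypothetical responses from four physicians to three items.
example_rows = [[10, 20, 50], [20, 30, 40], [30, 40, 60], [40, 50, 30]]
flagged = items_to_review(example_rows)  # only the third item (index 2) is flagged
```

In this toy data the first two items track the total closely, while the third does not and would be a candidate for revision or removal.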

Patient and public involvement

Our participants were physicians working in hospitals. They took part in the presurvey and formal survey to complete our scale. Participation was voluntary, and no incentives were provided for participation. Participants were not directly involved in the design or recruitment of this study. The results were not provided to participants.


Sample characteristics

Three hundred and ninety-six responses (online survey) were received, and 11 were excluded due to incomplete demographic information. There were no issues related to floor or ceiling effects, as every item was answered through the web-based survey. The characteristics of the participants are presented in table 2.

Table 2

Respondents’ characteristics

Reliability of physician mental workload scale

Each of the six components demonstrated at least satisfactory internal consistency (higher than 0.70), with Cronbach’s α in the range of 0.70–0.90. The Cronbach’s α for the whole scale was 0.81, which indicated that the scale had good reliability.

Validity of physician mental workload scale

The corrected I-CVI of each item ranged from 0.85 to 1 (table 3), which was higher than 0.78. The S-CVI/Ave was 0.96, which was higher than 0.90. All of these values supported the good content validity of the scale.

Table 3

Content validity and correlation coefficient of item-total scores of the scale

The correlations between the items and the total score were inspected to assess convergent validity; all coefficients were 0.40 or above, except for F1 and F2 (table 3). The correlation coefficients between the dimensions and the total score ranged from 0.37 to 0.72, showing that the dimensions and total scores had good convergent validity as well. Additionally, the correlation coefficients among the dimensions were lower than the correlation coefficients between the dimensions and the total score, which indicated that the scale had good discriminant validity (table 4).

Table 4

Correlation coefficient matrix between dimensions and total scores of the scale

Exploratory factor analysis of physician mental workload scale

The KMO measure of sampling adequacy was 0.81, which was higher than the recommended value of 0.70. Bartlett’s test of sphericity was significant (χ2=1950.70, p<0.001). Thus, the data were suitable for factor analysis. Considering the experts’ suggestions, we selected six principal components in the EFA, and the results showed that the six-dimensional model explained 81.88% of the total variance (table 5).

Table 5

Factor loadings for the rotated component matrix: varimax rotated components

Component 1, ‘mental demands’, was developed from three items that asked about feeling or memory requirements, emotional requirements and the effort required to overcome difficulties, with factor loadings in the range of 0.74–0.81. Component 2, ‘frustration level’, consisted of two items that asked about anxiety and levels of depression or frustration, with factor loadings in the range of 0.86–0.88. Component 3, ‘physical demands’, consisted of two items related to strength requirements and the intensity of physical activity, with factor loadings in the range of 0.84–0.90.

Component 4, ‘temporal demands’, comprised two items that asked about the ratio of required time to available time and the frequency of multitasking, with factor loadings in the range of 0.77–0.82. Component 5, ‘performance’, comprised two items related to the sense of achievement and job satisfaction regarding work outcomes, with factor loadings in the range of 0.85–0.90. Component 6, ‘perceived risk’, included only one item that addressed the perception of risk in conducting tasks (such as medical disputes and the risk of infection), with a factor loading of 0.84.

Confirmatory factor analysis of physician mental workload scale

The six-factor model obtained after EFA was tested by CFA using the maximum likelihood estimation method. The goodness-of-fit indices were as follows: χ2/df=1.84 (<3), RMR=0.04 (<0.05), GFI=0.97 (>0.9), CFI=0.98 (>0.9), TLI=0.97 (>0.9), and RMSEA=0.05 (≤0.05). Based on these criteria, the model was a good fit for the data.
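The comparison of the reported indices with the cut-offs stated in the statistical analysis section can be written out explicitly. The sketch below only restates values given in the text; the dictionary layout and the function name `check_fit` are illustrative choices, not output from AMOS.

```python
import operator

# Cut-offs from the statistical analysis section: (comparison, threshold).
FIT_CRITERIA = {
    "chi2/df": (operator.lt, 3.0),
    "RMSEA": (operator.le, 0.05),
    "RMR": (operator.lt, 0.05),
    "GFI": (operator.gt, 0.90),
    "CFI": (operator.gt, 0.90),
    "TLI": (operator.gt, 0.90),
}

# Values reported for the six-factor model.
REPORTED_FIT = {
    "chi2/df": 1.84, "RMSEA": 0.05, "RMR": 0.04,
    "GFI": 0.97, "CFI": 0.98, "TLI": 0.97,
}

def check_fit(indices, criteria=FIT_CRITERIA):
    """Return, for each index, whether it satisfies its cut-off."""
    return {name: compare(indices[name], cutoff)
            for name, (compare, cutoff) in criteria.items()}

fit_results = check_fit(REPORTED_FIT)
all_criteria_met = all(fit_results.values())
```

Note that RMSEA sits exactly on its cut-off (0.05), so it passes only because that criterion is stated as "≤" rather than "<".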


The purpose of this study was to develop a mental workload scale for physicians and explore its validity and reliability. The test results show that the scale is reliable and valid; hence, it is considered an effective instrument for assessing physician mental workload in Chinese comprehensive public hospitals. The results show a six-dimensional model that includes aspects related to mental demands, physical demands, temporal demands, perceived risk, frustration level and performance. In contrast to other relevant scales, this scale includes only 12 items; its brevity is a strength because it can be completed in a short time. As for the scale’s contents, the dimensions of perceived risk and temporal demands are distinctive for physician mental workload in China.

The Cronbach’s α of the whole scale was higher than 0.7, which indicated that the scale had good reliability. Additionally, the corrected I-CVI was higher than 0.78, and the S-CVI/Ave was more than 0.9, which showed that it had good content validity. For the construct validity, except for F1 and F2, the correlation coefficients between the item and total scores were more than 0.4, which showed that the construct validity was good. The item-total correlations of the two items in the performance dimension were near 0.4, and these items might have been more relevant with reverse scoring. Consistent with previous research on NASA-TLX, the performance dimension shows limited practical relevance since variations influence it in terms of physical load.42 Another study reported that subjective assessments of mental workload might not provide an accurate estimation of the performance dimension.26 Considering this information, we retained the two items but revised their descriptions.

The dimension of perceived risk, which is not included in the NASA-TLX or SWAT frameworks, is highly associated with physician mental workload in China. There tends to be an estranged relationship between physicians and patients, which puts physicians at risk of being assaulted by patients or visitors.43 According to statistics, 96% of medical staff were abused or injured in 2012.44 The physician–patient relationship is becoming increasingly fragile and has reached an unprecedentedly poor level in China.45 This tense relationship results in a heavy psychological workload during physicians’ work.

Another dimension, temporal demands, is also highly specific. The gap between healthcare demand and supply (and thus the doctor–patient ratio) in China has caused physicians in secondary and tertiary hospital settings to become overworked.46 They frequently need to work overtime and perform more than one task at the same time. According to a report by the Chinese Medical Association in 2018, physicians in tertiary hospitals had an average workweek of 51.05 hours, which was more than the legal 40 hours per week.47 Research has reported that physicians may feel stressed when poor scheduling leaves them pressed for time.48 Mental workload encompasses the subjective experience of a given task load.49 High task demands require considerable time and mental effort and represent a heavy workload for physicians.50 The worse physicians’ experience of their task load, the higher their mental workload is.51

Although we have attempted an accurate examination of the measurement properties of the physician mental workload scale by using qualitative and quantitative methods, there are still some limitations that merit discussion. First, among the six dimensions, perceived risk included only one item, which may have resulted in measurement error. Second, there was a potential reporting bias in the self-reported measurements of workload among physicians. Third, all respondents voluntarily decided to take part in the survey. Physicians who were overburdened at the time of the study may not have had time to take part in the investigation, which could have resulted in selection bias. Thus, these findings reveal the need for continued research to improve this scale. Meanwhile, burnout is also relevant to mental workload and is another direction for further exploration.


Creating new items from a subjective perspective is of paramount importance in investigating Chinese physicians’ workload. The physician mental workload scale has acceptable preliminary psychometric properties, with 6 dimensions and 12 items. The use of this scale can help us identify the main stressors in physician mental workload and implement targeted optimisation strategies to mitigate these stressors in order to enhance the physical and mental health of physicians. Doing so will consequently improve the quality and efficiency of healthcare delivery in hospital settings.


We would like to thank all of the participants involved in this research for their time and contributions.




  • Contributors YH and QF designed the study; ChuL and JX performed formal analysis; YH obtained funding; JX and LD took part in the investigation; LW, ChuL and LD were involved in data cleaning; ChuL wrote the original draft; SG, QF and ChaL contributed to the interpretation of the results; and ChaL and ChuL performed critical revisions of the manuscript; all authors have read and approved the final manuscript.

  • Funding This work was supported by the National Natural Science Foundation of China (grant number 71774062).

  • Disclaimer The funder had no role in the study design, data collection, analysis, decision to publish the manuscript or manuscript preparation.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval The Ethics Committee of Tongji Medical College, Huazhong University of Science and Technology (IORG No. IORG0003571), gave the final approval for the study.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available on reasonable request.
