
## Abstract

**Objective** Some researchers have reported that distribution of total depressive symptom scores in the general population may follow an exponential pattern except at the lowest end of the scores. To understand the mechanism responsible for this phenomenon, we investigated the mathematical patterns of the individual distributions for each item of a depressive symptom scale.

**Methods** We analysed data from 32 022 participants in the general population who participated in the Active Survey of Health and Welfare, Japan. Depressive symptoms were assessed using the Japanese version of the Center for Epidemiologic Studies Depression Scale (CES-D). The CES-D has 20 items, each of which is scored in 4 grades: ‘Rarely’, ‘Some’, ‘Much’ and ‘Most of the time’.

**Results** The individual distributions of 16 negative items belonging to the depressive mood, somatic symptoms and retarded activities, and interpersonal relations categories, followed a common mathematical pattern, which displayed different distributions with a boundary at ‘Some’. The distributions for the 16 items between ‘Rarely’ and ‘Some’ appeared to cross at a single point. On the other hand, the distributions of the 16 items between ‘Some’ and ‘Most’ followed a linear pattern when plotted using a log-normal scale. The remaining 4 items in the positive affect subscale showed non-specific patterns.

**Conclusions** The common mathematical pattern of the 16 negative item distributions may contribute to the exponential pattern of the distribution of total depressive symptom scores except at the lowest end of the scores.

- STATISTICS & RESEARCH METHODS
- EPIDEMIOLOGY

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/


### Strengths and limitations of this study

To the best of our knowledge, this is the first mathematical model analysis of the individual distributions for each item response of a self-reported depression scale.

A large representative sample of the Japanese general population was used.

We found the distributions of the 16 negative items exhibited a common mathematical distribution.

Analysis based on other mathematical models was not performed.

## Introduction

Depression is a common mental disorder that is among the leading causes of disability worldwide.1 Given that depressive symptoms are closely linked with depression, there has been great interest in understanding the distribution of depressive symptoms in the general population.2–4

Recently, we reported that the right tail of the distribution of total depressive symptom scores based on the Center for Epidemiologic Studies Depression Scale (CES-D) followed an exponential curve in the general population.5 In accordance with our results, Melzer *et al*6 reported that an exponential curve provided the best fit for total neurotic symptoms and depressive scores from the British National Survey of Psychiatric Morbidity. Thus far, the potential mechanisms contributing to the specific pattern of the total depressive symptoms distribution have not been examined.

Of note, both Melzer *et al* and our group have pointed out that total depressive symptom scores follow an exponential curve over a specific level of depressive symptom scores. Melzer *et al* reported that all neurotic symptoms and depressive scores using the Revised Clinical Interview Schedule (CIS-R) follow an exponential curve for symptom counts of more than 3. We reported that total depressive symptom scores follow an exponential curve in the range of CES-D scores over 11 points. These results indicate that the distributions of total depressive symptom scores differ according to the level of the total depressive symptom scores. The key to understanding the exponential curve of the total depressive symptom scores may lie in the different patterns of the distribution according to the levels of depressive symptom scores.

Depressive symptom scales are composed of individual questionnaire items. It has been hypothesised that the distributions of the responses to each item contribute to the different patterns of total depressive symptom score distributions. However, to our knowledge, little is known about the mathematical pattern of each item response in depressive symptom scales.

The CES-D is a 20-item self-reported scale that has been shown to measure depressive symptoms across four domains: depressed affects, somatic symptoms, positive affects and interpersonal difficulties.7 These four factors have been replicated in several populations and confirmed using meta-analytic methods.8–10

The present study examined the distributions of 20 depressive symptom items according to four factors using more than 30 000 CES-D assessments. The goal of the present study was to delineate the distribution patterns for each depressive symptom item and to examine whether the distributions of responses to each item contributed to the different patterns of the distributions according to the level of total depressive symptom scores.

## Methods

### Participants

We used data from the Active Survey of Health and Welfare (ASHW) conducted by the Japanese Ministry of Health, Labor and Welfare, in 2000. The ASHW is an annual nationwide survey conducted to obtain data required for policy making by the Japanese Government. In 2000, the ASHW examined depressive symptoms among a representative sample of the Japanese general population. To ensure that the sample was representative of the general population, survey participants were selected from among individuals aged 12 years and over living in 300 communities across Japan. These communities were selected from 881 851 precincts identified in the 1995 census using a stratified sampling design. Informed consent was obtained from all the participants. The data and methods used by the survey have been publicised in detail.11

The questionnaire was returned by 32 729 respondents. The response rate was not published by the Ministry of Health, Labor and Welfare. However, the response rates for similar surveys conducted 3 and 4 years earlier were 87.1% and 89.6%.12 We assumed that the response rate for the present study was over 80%. A total of 707 participants who returned a blank questionnaire were excluded from the analysis. The final sample size was 32 022.

### Measures

Depressive symptoms were assessed using the Japanese version of the CES-D. This 20-item scale assesses the frequency of a variety of depressive symptoms within the previous week (0=rarely or none of the time, 1=some of the time, 2=much of the time and 3=most or all of the time). Higher values indicate greater psychological distress. The 20 items of the CES-D were grouped into the following four subscales: depressive mood (items 3, 6, 9, 10, 14, 17 and 18), somatic and retarded activities symptoms (items 1, 2, 5, 7, 11, 13 and 20), interpersonal relations (items 15 and 19) and positive affects (items 4, 8, 12 and 16). The positive affect items were reverse scored (as shown in table 2).
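The scoring rule described above can be illustrated with a minimal sketch. The subscale item numbers are taken from the text; the function name `score_cesd`, the dictionary layout, and the reverse-scoring formula `3 - raw` (the usual CES-D convention) are assumptions made for illustration, not the authors' code.

```python
# Subscale item numbers as listed in the Measures section.
SUBSCALES = {
    "depressive_mood": [3, 6, 9, 10, 14, 17, 18],
    "somatic_retarded": [1, 2, 5, 7, 11, 13, 20],
    "interpersonal": [15, 19],
    "positive_affect": [4, 8, 12, 16],
}
POSITIVE_ITEMS = set(SUBSCALES["positive_affect"])


def score_cesd(responses):
    """Return (total, subscale_totals) for one respondent.

    `responses` maps item number (1-20) to the raw answer 0-3.
    Positive affect items are reverse scored as 3 - raw (assumed convention).
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected answers for items 1-20")
    scored = {}
    for item, raw in responses.items():
        if raw not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: answer must be 0, 1, 2 or 3")
        scored[item] = 3 - raw if item in POSITIVE_ITEMS else raw
    subscale_totals = {
        name: sum(scored[i] for i in items) for name, items in SUBSCALES.items()
    }
    return sum(scored.values()), subscale_totals
```

Under this convention, a respondent who answers 0 to every item scores 0 on the 16 negative items and 3 on each of the four positive affect items.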

### Analysis procedure

The item response percentages were delineated according to the depressive mood subscale, the somatic and retarded activities subscale, the interpersonal subscale and the positive affect subscale. The response rate for each item varied from 81.3% to 89.0%. Participants who did not respond to a given item were excluded from the percentage analysis for that item. The item response percentages for the distributions of the 20 items were analysed using a normal scale and a log-normal scale.
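As a rough sketch of the percentage calculation, the snippet below computes the response percentages for a single item after dropping non-respondents. The function name `item_response_percentages` and the use of `None` to mark a missing answer are illustrative assumptions; the study itself used JMP.

```python
from collections import Counter

LABELS = ("Rarely", "Some", "Much", "Most")


def item_response_percentages(answers):
    """Percentage of each response among those who answered the item.

    `answers` is an iterable of scores 0-3, with None for no response.
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        raise ValueError("no valid responses for this item")
    counts = Counter(valid)
    return {
        label: 100.0 * counts.get(score, 0) / len(valid)
        for score, label in enumerate(LABELS)
    }
```

Because the denominator is the number of respondents to that item, the four percentages always sum to 100 regardless of how many participants skipped it.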

We used the JMP V.11 for Windows (SAS Institute, Inc, Cary, North Carolina, USA) to calculate the descriptive statistics and the frequency distribution curves.

## Results

The demographic characteristics of the participants are shown in table 1. Participants who did not respond to items regarding sex or age (n=222) were also included in the analysis.

Table 2 shows the response rates for the 20 items in the CES-D questionnaire. The items in the depressive mood group, the somatic symptoms and retarded activities group, and the interpersonal relations group, exhibited a common pattern, with the highest response frequency for ‘Rarely’ and a decreasing response frequency as the item score increased, with the lowest response frequency observed for ‘Most’. No exceptions to this pattern were seen. On the other hand, the four items in the positive affect subscale did not exhibit a similar pattern.

As depicted in figure 1, the depressive mood subscale (figure 1A), the somatic symptoms and retarded activities subscale (figure 1B), and the interpersonal relations subscale (figure 1C), exhibited right-skewed distributions, whereas the four items in the positive affect subscale (figure 1D) showed a plateau-shaped distribution, suggesting that the distribution pattern of the positive affect subscale differed from that of the other groups.

To ascertain the distribution pattern of each response for the 16 items in the depressive mood subscale, the somatic symptoms and retarded activities subscale, and the interpersonal relations subscale, the responses were plotted together. The distributions for each of the 16 items showed a common pattern, which displays different distributions with a boundary at ‘Some’ (figure 2A). The word ‘boundary’ refers to that which divides one mathematical pattern from another. The lines for the 16 items crossed with each other between ‘Rarely’ and ‘Some’, whereas the lines rarely crossed between ‘Some’ and ‘Most’. As indicated by the arrow, the lines for all 16 negative items between ‘Rarely’ and ‘Some’ appeared to cross at a single point. The distributions of the 16 items from ‘Some’ to ‘Most’ showed a characteristic pattern.

With a log-normal scale, the distributions for the 16 items showed a linear pattern for the ‘Some’ to ‘Most’ responses (figure 2B), suggesting that these 16 items exhibited an exponential pattern for this response level. In addition, the lines for the 16 items were almost parallel to each other, suggesting that the gradients of the linear patterns for the 16 items were similar to each other.
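The link between a straight line on a logarithmic axis and an exponential pattern can be checked numerically. The sketch below fits a least-squares line to the logarithm of the ‘Some’, ‘Much’ and ‘Most’ probabilities against their scores 1, 2 and 3. The function name and the example probabilities are hypothetical and are not values from table 2.

```python
import math


def log_linear_fit(p_some, p_much, p_most):
    """Least-squares line through (score, ln p) for scores 1, 2 and 3.

    Returns (slope, intercept, max_abs_residual).
    """
    xs = [1, 2, 3]
    ys = [math.log(p) for p in (p_some, p_much, p_most)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    residual = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
    return slope, intercept, residual


# Hypothetical item whose probabilities shrink by a constant ratio of 0.4.
slope, intercept, residual = log_linear_fit(0.2, 0.2 * 0.4, 0.2 * 0.4**2)
ratio = math.exp(slope)  # constant ratio between adjacent response levels
```

When the three probabilities form a geometric sequence, the residual is zero up to rounding and `exp(slope)` recovers the ratio. Parallel lines for different items therefore correspond to items sharing a similar ratio.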

## Discussion

The aim of the present study was to delineate the distributions of the responses to each depressive symptoms item and to examine whether the distributions of these item responses contributed to the different distribution patterns observed according to the level of total depressive symptom scores.

The main finding in the present study is that the distributions of the 16 items belonging to the depressive mood subscale, the somatic symptoms and retarded activities subscale, and the interpersonal subscale, follow a common mathematical model, which displays different patterns with a boundary at ‘Some’. The lines for the 16 items appeared to cross at a single point between ‘Rarely’ and ‘Some’, whereas the distributions of the 16 items between ‘Some’ and ‘Most’ followed an exponential pattern with the same parameter.

To confirm the reproducibility of these findings, we examined other published studies with various populations13–16 and confirmed this phenomenon in other studies, although the degree of fit to the mathematical model varied according to the studies (data not shown). These results, together with those of the present study, suggest that the 16 negative items could follow the same distribution model in various populations.

The mechanism responsible for the exponential pattern of the total CES-D score, except at the lowest end of the symptom score, could be hypothesised from the aforementioned findings as follows. Since the distributions of the positive affect items exhibited plateau-shaped patterns, the share of the positive affect score in the total depressive symptom score would decrease as the total CES-D score increases. Thus, the influence of the 16 negative items on the total CES-D score grows as the total CES-D score increases. Since the 16 negative items exhibited exponential patterns between the ‘Some’ and ‘Most’ response levels, the total CES-D score may follow an exponential curve over the corresponding range of CES-D scores. In addition, the similar exponential parameters of the 16 negative items might also contribute to the distribution of the total CES-D score. Since the total CES-D score is the sum of the individual item scores, and the responses to the items are inter-related, it has been hypothesised that these inter-relations contribute to the distribution of the total CES-D score. However, the inter-relations among the item responses are complicated.8–10 Further evaluations using numerical expressions or simulation analyses are needed.

### Mathematical model and crossing at a single point between ‘Rarely’ and ‘Some’

The mathematical model that the 16 negative items follow is shown in figure 3A. Let the probability of ‘Some’ be represented as P_{1} and the common ratio among ‘Some’, ‘Much’ and ‘Most’ as r. The probabilities of ‘Some’, ‘Much’, ‘Most’ and ‘Rarely’ are then expressed as P_{1}, P_{1}r, P_{1}r^{2} and 1−P_{1}×(r^{2}+r+1), respectively. The scores of ‘Rarely’, ‘Some’, ‘Much’ and ‘Most’ are expressed as 0, 1, 2 and 3, respectively.
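A minimal sketch of this two-parameter model follows. The function name `item_distribution` and the example values of P_{1} and r are hypothetical, chosen only to show that the four probabilities sum to 1.

```python
def item_distribution(p_some, r):
    """Probabilities of scores 0-3 (Rarely, Some, Much, Most) under the model.

    p_some is the probability of 'Some' (P_1); r is the common ratio.
    """
    tail = p_some * (1 + r + r**2)  # total probability of Some, Much and Most
    if p_some <= 0 or r <= 0 or tail > 1:
        raise ValueError("parameters do not define a valid distribution")
    return [1 - tail, p_some, p_some * r, p_some * r**2]


# Hypothetical parameters: P_1 = 0.2 and r = 0.5.
probabilities = item_distribution(0.2, 0.5)
```

With these illustrative parameters the probabilities are approximately 0.65, 0.20, 0.10 and 0.05, so ‘Rarely’ absorbs whatever probability is left after the geometric tail.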

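In the present study, the lines for the 16 negative items appeared to cross together at a single point between the ‘Rarely’ and ‘Some’ response levels (figure 2A). This finding can be explained by the mathematical model. As shown in figure 3A, the line between ‘Rarely’ and ‘Some’ is expressed as:

$$y = a_{1}x + b_{1}$$

where a_{1} is the gradient and b_{1} the intercept of the line, x is the item score and y is the response probability.

Then a_{1} and b_{1} can be expressed as follows:

$$a_{1} = P_{1}(r^{2}+r+2) - 1$$

and

$$b_{1} = 1 - P_{1}(r^{2}+r+1)$$

As shown in figure 3B, the two lines between ‘Rarely’ and ‘Some’ are expressed as follows:

$$y = a_{1}x + b_{1}$$

and

$$y = a_{2}x + b_{2}$$

where, with P_{2} denoting the probability of ‘Some’ for the second item,

$$a_{2} = P_{2}(r^{2}+r+2) - 1$$

and

$$b_{2} = 1 - P_{2}(r^{2}+r+1)$$

The point where the two lines intersect can then be obtained as follows:

$$x = \frac{r^{2}+r+1}{r^{2}+r+2}$$

and

$$y = \frac{1}{r^{2}+r+2}$$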

The intersection point is expressed by r only. Therefore, regardless of the slope and intercept of each line, all the lines that follow this mathematical model with a common r must cross at a single point between ‘Rarely’ and ‘Some’.
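The single crossing point can be verified numerically. Working only from the model's definitions, the segment for one item joins the point (0, 1−P_{1}(r^{2}+r+1)) to the point (1, P_{1}); intersecting two such segments that share r gives x = (r^{2}+r+1)/(r^{2}+r+2) and y = 1/(r^{2}+r+2). In the sketch below, the function names and the parameter values are hypothetical.

```python
def rarely_some_line(p_some, r):
    """Slope and intercept of the line through (0, P_rarely) and (1, P_some)."""
    p_rarely = 1 - p_some * (1 + r + r**2)
    return p_some - p_rarely, p_rarely


def intersection(line_a, line_b):
    """Intersection (x, y) of two non-parallel lines given as (slope, intercept)."""
    (a1, b1), (a2, b2) = line_a, line_b
    x = (b2 - b1) / (a1 - a2)
    return x, a1 * x + b1


r = 0.5  # hypothetical common ratio shared by both items
x, y = intersection(rarely_some_line(0.1, r), rarely_some_line(0.3, r))

# Closed-form crossing point, which depends on r only.
s = r**2 + r + 1
x_closed, y_closed = s / (s + 1), 1 / (s + 1)
```

Changing the two ‘Some’ probabilities moves both lines but leaves the crossing point where it is, as long as r is shared. Items with different r values would not be expected to meet at exactly one point.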

Recently, item-response theory, developed in the field of education, has been used to evaluate depressive symptom scales.17,18 In general, item-response theory assumes that a normally distributed latent trait underlies performance in a measure. However, to the best of our knowledge, no evidence for a normally distributed latent trait of depressive symptoms has been reported. Our results suggest the possibility that the distribution of latent traits of depressive symptoms could be exponential.

The reason why the distributions of the 16 negative items between ‘Some’ and ‘Most’ followed an exponential pattern is not clear. In general, an exponential distribution is observed where individual variability and total stability are organised together, such as the Boltzmann-Gibbs law and income distribution.19 The Boltzmann formula shows the relationship between entropy and the number of ways the molecules of a thermodynamic system can be arranged. From the viewpoint of the Boltzmann formula, Dragulescu and Yakovenko19 suggested that any conserved quantity in a big statistical system has an exponential probability distribution in equilibrium. With respect to individual variability and total stability, the conditions that enable an exponential distribution could be present in the distributions of the 16 negative items. Further studies are needed to clarify the mechanism.

### Different patterns with a boundary at ‘Some’

The reason why the distributions of the 16 negative items display different patterns with a boundary at ‘Some’ is unknown, but the conditions that enable such a distribution can be speculated on. In fact, psychological phenomena are evaluated using an ordinal scale.20 Unlike an interval scale, an ordinal scale is not necessarily equally spaced. If the range for ‘Rarely’ differs from that of ‘Some’, ‘Much’ and ‘Most’, the distributions of the 16 negative items would display different patterns with a boundary at ‘Some’.

In general, each item of the CES-D is rated in two stages. First, the population is divided according to the presence or absence of the symptom. If the symptom of each item is absent, it is regarded as ‘Rarely’. Next, if the symptom of each item is present, the duration of the symptom is quantified and divided into ‘Some’, ‘Much’ and ‘Most’. This two-step process increases the possibility that ‘Rarely’ will cover the participants without the symptom, while each of ‘Some’, ‘Much’ and ‘Most’ will cover about one-third of the range with the symptom. In other words, ‘Rarely’ is scored using a purely ordinal scale, whereas ‘Some’, ‘Much’ and ‘Most’ are scored using an ordinal scale that is close to an interval scale. Further consideration regarding this speculation is needed.

### Distributions of the positive affect items

Although the four positive affect items showed a plateau-shaped distribution in this study, the evidence for the distributions of positive affect symptoms has been mixed. To the best of our knowledge, no common response patterns to positive affect symptom items have been reported so far. A number of cross-cultural comparison studies have reported that the response patterns for the positive affect items vary according to ethnicity or nation (eg, skewed, plateau-shaped, U-shaped and reverse U-shaped), whereas the response patterns for the 16 negative items were generally comparable.21,22 Recently, because of their relative independence, positive affect and negative affect have been commonly recognised as two different phenomena that should be studied individually.23,24 Our results lend credibility to the view that positive affect and negative affect are two different phenomena. Although the CES-D score is the composite score of the 20-item scores, it could be appropriate to recognise the 16 negative item scores and the positive scores as different scores.

### Strengths and limitations

There are some methodological advantages in the present investigation. First, the sample was representative of the general Japanese population. Survey participants were selected among individuals living in 300 communities, which were selected from 881 851 precincts identified in the 1995 census using a stratified sampling design. The use of a representative sample of the Japanese general population reduced the selection bias of the data. Second, the relatively large sample size (N=32 022) increased the ability to elucidate the patterns of the distributions of depressive symptom items.

This study has some limitations. First, a standard psychiatric diagnosis with structured interview was not performed for the participants in this study. Therefore, the study did not encompass a psychiatric diagnosis of depressive symptoms. Second, although the 16 negative symptom items exhibited a linear pattern with a log-normal scale, analysis based on other mathematical models was not performed. In general, the most important part of model evaluation is testing whether the model better fits empirical data than other models. However, to the best of our knowledge, no other mathematical models for the distributions of depressive symptoms have been reported so far. Therefore, it is difficult to test whether the estimated data of the present model fit empirical data better than those of other models. Thus, using graphical analysis, we propose a simple model that consists of only two parameters as a baseline for improvement. In general, there is a trade-off between the goodness of fit and complexity of the model.

Despite these limitations, the present study provides important information regarding the theory of a self-report depression scale. The scores of a self-report depression scale can be interpreted in a norm-referenced manner. A norm-referenced score interpretation compares an individual’s scores on the test with the statistical representation of a population. One representation of statistical norms is the normal distribution, which is adopted as the distribution model of intelligence. The statistical norm is useful to evaluate the scores of the individuals in a population and to verify the result of the survey. The present model could provide a statistical norm for a self-report depression scale.

The degree to which these results can be generalised to other scales for depressive symptoms is unclear, but warrants examination. While the item scale of CES-D assesses the frequency of a variety of depressive symptoms within the previous week, others (eg, CIS-R) are composites of frequency, salience and severity. Given that depressive symptoms, such as depressed mood, anxiety and insomnia, affect people worldwide, an overarching mathematical model explaining the distributions of depressive symptom scores in a general population could be useful.

## Acknowledgments

The authors would like to thank the Active Survey of Health and Welfare project for providing the data for this study, and Dr Shinji Sakamoto for the helpful advice.

## References

## Footnotes

**Contributors** All the authors participated in the study, and read and approved the final manuscript. ST conceived and designed the experiments, and performed the experiments. ST, YK and TF were involved in interpretation of data for the work and wrote the paper.

**Funding** This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

**Competing interests** None declared.

**Patient consent** Obtained.

**Ethics approval** Our present study was approved in 2014 by the ethics committee of Panasonic Health Center, approval number 2014–1. The methods were carried out in accordance with the approved guidelines and regulations.

**Provenance and peer review** Not commissioned; externally peer reviewed.

**Data sharing statement** No additional data are available.
