Background

It has been well documented that women live longer than men at all ages, with the female–male difference in life expectancy at birth varying from 4.4 years in Sweden to 13.2 years in the Russian Federation [1]. The female advantage in life expectancy grew rapidly during the first three quarters of the twentieth century in most Western countries [2, 3]. The sex gap narrowed in most EU countries and the US during the last two decades of the twentieth century, although the timing of this reversal in the differential trend by gender varied across countries [1, 3, 4]. Japan was the only population among the G7 countries with a persistently widening sex differential in life expectancy from 1950 to 1999 [5].

While in 2006 women outlived men even in the poorest countries of the world [6], the female advantage in healthy life expectancy is less pronounced [7]. Women tend to rate their general health lower in most EU countries [8], have more difficulties in performing activities of daily living (ADL) [9, 10], have inferior performance on various physical tests [11–13], and are more likely to report a chronic condition than men [14]. However, the literature suggests that sex differences in health depend on the selected dimension of health and the age group studied. Men report more life-threatening conditions, such as heart insufficiency, angina pectoris or thrombosis in the leg, while women suffer more frequently from non-life-threatening but disabling conditions, e.g. migraine, musculoskeletal and autoimmune diseases [15–17]. Although men have a higher incidence of coronary heart disease (CHD) in middle age, the sex gap in CHD incidence narrows after 60 and becomes very small after 80 years [18–20].

Cross-national comparison studies have focused primarily on sex differentials in life expectancy and mortality; less research has examined sex differences in various health domains across countries. The difficulties of cross-national comparisons include the lack of comparable data and cross-cultural differences in reference levels used in answering questions about health, response categories, and in health perceptions [21]. However, such factors are less likely to bias comparisons of within-country male–female differences or ratios in measured health. Besides, cross-national comparison studies increase our ability to generalize research findings about sex differentials in health. The present study aims to describe sex differences in mortality and several major health dimensions across countries representing populations in Europe, Asia, and North America. To our knowledge, no prior study has compared sex differences in health among individuals aged 45 to 90+ living in Denmark, Japan, and the USA, countries with diverse cultures, health systems and phases of demographic and epidemiologic transition.

Methods

Study populations

The study is a secondary analysis of population-based studies conducted in Denmark, Japan, and the US, with details described elsewhere [22–27].

In brief, the Danish data come from the Study of Middle-Aged Danish Twins (MADT), the Longitudinal Study of Aging Danish Twins (LSADT), and the Danish 1905-Cohort Study. Eligible participants were identified through the Danish Twin Register [28] and Danish Civil Registration System [29]. The MADT represented a random sample of 120 twin pairs from each birth cohort from 1931 to 1952, aged 45–68 years in 1998 when the survey was implemented. The LSADT involved Danish twins aged 75+ and residing in Denmark by January 1995 when the baseline survey was carried out. The waves in 1997, 1999, and 2001 included both follow-up of participants from previous waves and new participants aged at least 70 years. Previous research in Denmark demonstrated that twins are representative of the general population in terms of health trajectories and all-cause or cardiovascular mortality and, thus, are good population models for epidemiological and demographic research [30, 31].

The participants of the 1905-Cohort Study included all Danes born in 1905 and residing in Denmark in 1998, when the baseline survey was initiated (n = 2,262). The consecutive waves in 2000, 2003 and 2005 were follow-up assessments of survivors from previous waves. In all surveys the individuals residing in nursing homes or sheltered accommodation were considered eligible to participate in the study. If persons refused or were unable to participate in the face-to-face interview, a proxy respondent, usually a close relative, was sought. In total, 4,314 middle-aged twins, 4,731 old-aged twins, and 2,262 oldest-old persons completed the intake assessment in the MADT, LSADT, and the Danish 1905-Cohort Study, respectively, and were included in the present analyses (Table 1). The data for 13 twins who participated in both the LSADT and the Danish 1905-Cohort Study were taken from the LSADT.

Table 1 Sample size in the Danish surveys, the Health and Retirement Study, and the Nihon University Japanese Longitudinal Study of Aging

The initial response rates were 83.1% in the MADT, 77% in the LSADT, and 62.8% in the Danish 1905-Cohort Study [22–24]. All three Danish studies are comparable with regard to the design, implementation and data collection instruments with only minor differences, primarily reflecting the different age distributions in the three surveys. Data collection in each survey wave was carried out within approximately 3 months and at participants’ homes.

The US data are from the Health and Retirement Study (HRS), a biennial survey representative of the US population 50 years of age and over [25]. The HRS began in 1992 and the sample has been resurveyed approximately every 2 years since then. Younger cohorts have been added subsequently; a sample representing those born in 1942–47 was added in 2004 so that at that date the survey represented people born in 1947 and before, along with their spouses. Most of the HRS data utilized in the present study were collected in 2006, when the survey was representative of those 52 years of age and over and their spouses (Table 1). The institutional population is not initially included in the HRS sample at the time of the first interview, but the longitudinal sample is followed into institutions so that the institutional population is almost fully included in the current data. Sample sizes vary because only about half the sample received the physical performance measures, e.g. grip strength (N = 7,176). The whole sample provided information on self-reported health (N = 17,852) and self-reported depressive symptoms (N = 16,539) and performed tests of immediate word recall (N = 16,481). The MMSE was included in what is known as the “ADAMS” subsample of the main survey, undertaken with a much smaller sample selected from the main survey in 2000 and 2003 (N = 814) [32]. HRS data are collected using in-person and phone interviews, although mail-back interviews have been used where requested. The household response rate in the HRS in 2006 was 88%.

The Nihon University Japanese Longitudinal Study of Aging (NUJLSOA) is a longitudinal survey of a nationally representative sample of the population aged 65 and over in Japan initiated in 1999 (n = 4,997) [33, 34]. The overall response rate at the baseline survey was 74.6%. The follow-up surveys were conducted in 2001, 2003, and 2006. The sample was refreshed in 2001 and 2003 for those aged 65 and 66. The NUJLSOA data utilized in the present study were from the 2006 follow-up survey, which is representative of those 68 years of age and over (Table 1). The NUJLSOA sample at baseline included community-dwellers only, but the follow-up surveys also included survivors followed into institutions and, thus, cover the institutionalized population. The NUJLSOA survey was designed to be comparable to the US Longitudinal Study of Aging and the AHEAD sample of the HRS.

The total numbers of participants in the Danish studies, the HRS and the NUJLSOA are presented in Table 1 by gender and age group. Sample sizes for each health measurement may vary due to missing values, which ranged from 0.1% (physical disabilities) to 10.9% (MMSE) in the Danish surveys, from 0.4% (self-rated health) to 8.1% (immediate recall test) in the HRS, and from 0.85% (physical disabilities) to 27.6% (grip strength) in the NUJLSOA.

Sex- and age-specific mortality

The Human Mortality Database (HMD) was used to examine sex differences in all-cause death rates in the three countries by 5-year age group and sex in 1998 [35]. The year 1998 was selected as it is the closest year to the time when the selected studies were initiated or expanded to include additional cohorts.

Health measures

In Denmark grip strength in kilograms was measured by using a Smedley dynamometer (TTM; Tokyo, Japan) [11]. In the LSADT grip strength was available starting from the second follow-up survey in 1999 and was measured as the maximum of 6 attempts, 3 attempts on each hand, whereas in the 1905-Cohort Study it was the maximum of 3 attempts. In the 2006 HRS, handgrip strength was measured twice on each hand using a Smedley spring-type hand dynamometer, and the score is the mean of the two tries for the dominant hand. If both hands were equally dominant, the right hand was used [36]. In the NUJLSOA, grip strength was measured using a TANITA Hand Grip Meter Blue 6103 in a standing position with the arm straight down at the side.
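The scoring rules described above for the Danish surveys and the HRS can be illustrated with a minimal Python sketch. The function names and the handling of missing attempts (here simply skipped) are assumptions made for illustration and are not taken from the study protocols.

```python
def danish_grip_score(attempts):
    """Maximum over the recorded attempts in kg (LSADT: up to 6, 1905-Cohort: up to 3).

    Missing attempts are passed as None and skipped (an assumption).
    """
    valid = [a for a in attempts if a is not None]
    return max(valid) if valid else None


def hrs_grip_score(left_tries, right_tries, dominant):
    """Mean of the two tries on the dominant hand.

    `dominant` is "left", "right" or "both"; the right hand is used
    when both hands are equally dominant.
    """
    tries = left_tries if dominant == "left" else right_tries
    return sum(tries) / len(tries)


# Hypothetical measurements in kilograms
print(danish_grip_score([30.0, 32.5, None, 31.0]))
print(hrs_grip_score((28.0, 30.0), (31.0, 33.0), "both"))
```

Because the Danish score is a maximum and the HRS score is a mean, the two are not directly interchangeable even for identical raw measurements.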

In the Danish surveys physical disability score was computed as the number of 6 basic ADL with which the person had difficulty. This included ability to dress and wash oneself, walk around the house, stand up from chair or bed, eat, and go to the toilet with 3 response options: yes, yes with aids/personal help, no. Persons reporting ‘no’ were coded as having difficulty. The basic ADL items were administered in the 1905-Cohort Study and LSADT 1995 and 1997 waves only. In the HRS, the physical disability score was calculated as the number of six basic ADL items with which an individual had difficulty because of a health or memory problem: dressing, walking, bathing, eating, transferring and toileting. There were 4 response options: yes, no, cannot do, and do not do. Individuals reporting ‘yes’ or ‘cannot do’ were coded as having difficulty. The same six items were used to construct ADL scores in the NUJLSOA. The question wording was similar to that in the HRS, except that “memory problem” was not mentioned as a cause of difficulty. The response categories were: yes, do not do, and no. The individuals who responded ‘yes’ and who responded ‘do not do’ because of a ‘health or physical state’ were coded as having difficulty. The analysis of sex differences in ADL included also proxy respondents in all three countries.
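The survey-specific coding rules above can be summarized in a short Python sketch. The function names, the survey labels and the `health_reason` flag are hypothetical; only the mapping from response options to "difficulty" follows the description in the text.

```python
def has_difficulty(survey, response, health_reason=False):
    """Return True if a single ADL item counts as a difficulty.

    `health_reason` is only used for the NUJLSOA, where 'do not do' counts
    as a difficulty when it is due to a health or physical state.
    """
    if survey == "danish":
        # Options: yes / yes with aids/personal help / no
        return response == "no"
    if survey == "hrs":
        # Options: yes / no / cannot do / do not do
        return response in ("yes", "cannot do")
    if survey == "nujlsoa":
        # Options: yes / do not do / no
        return response == "yes" or (response == "do not do" and health_reason)
    raise ValueError(f"unknown survey: {survey}")


def adl_score(survey, items):
    """Number of the six basic ADL items with difficulty.

    `items` is a list of (response, health_reason) pairs, one per item.
    """
    return sum(has_difficulty(survey, r, h) for r, h in items)


# Hypothetical NUJLSOA respondent
example = [("yes", False), ("do not do", True), ("do not do", False),
           ("no", False), ("no", False), ("no", False)]
print(adl_score("nujlsoa", example))
```

The sketch makes visible that the same answer ('do not do') is treated differently in the HRS and the NUJLSOA, one source of the limited cross-country comparability of the scores.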

Cognitive function was assessed using indicators of immediate word recall in the three studies and scores on the Mini-Mental State Exam (MMSE) in Denmark and the US. Respondents were asked to recall immediately a list of 12 nouns in the Danish surveys and 10 nouns in the HRS and NUJLSOA. For our analysis, the individual score was computed as the percentage of correctly recalled words. The Mini-Mental State Exam (MMSE) is a 19-item standard cognitive screen that was conducted in the HRS and all Danish surveys except MADT. Scores range from 0 to 30 with higher score reflecting better cognitive ability [37].
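As an illustration of the recall score described above, the following Python sketch converts a raw count of recalled words into a percentage. The function name and the example counts are hypothetical.

```python
def recall_percent(words_recalled, list_length):
    """Percentage of correctly recalled words (12-word list in the Danish
    surveys, 10-word list in the HRS and NUJLSOA)."""
    if not 0 <= words_recalled <= list_length:
        raise ValueError("words_recalled must be between 0 and list_length")
    return 100.0 * words_recalled / list_length


# Hypothetical respondents
print(recall_percent(6, 12))  # Danish surveys
print(recall_percent(5, 10))  # HRS / NUJLSOA
```

Expressing the score as a percentage puts the 12-word and 10-word tests on a common scale, although it does not remove the difference in task difficulty between list lengths.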

The global self-rated health (SRH) question asked interviewees to evaluate their health in general (How do you consider your health in general?) with 5 possible responses in all surveys. In the Danish surveys the responses were: excellent, good, acceptable, poor, and very poor and in the HRS: excellent, very good, good, fair, and poor. The same question with slightly different response categories was asked in the NUJLSOA: very healthy, healthier than average, of average health, somewhat unhealthy, very unhealthy. In all three studies the original mean scores were linearly transformed, so that the higher scores indicate better ratings of health in all three surveys.
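The exact linear transformation is not specified here. One common choice, shown below as a sketch only, is reverse-coding, under the assumption that responses are coded 1 (best category) to 5 (worst category); the function name is hypothetical.

```python
def reverse_code(score, n_categories=5):
    """Reverse a 1..n score so that higher values mean better health.

    Assumes the original coding runs from 1 (best) to n (worst).
    Being linear, the transform can be applied to individual scores
    or to group means alike.
    """
    return (n_categories + 1) - score


print(reverse_code(1))    # best category
print(reverse_code(5))    # worst category
print(reverse_code(2.4))  # a hypothetical group mean
```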

In the Danish study populations depression was assessed using the 21 depression items from the Cambridge Mental Disorders of the Elderly Examination (CAMDEX) reflecting participants’ current state and supplemented by additional 11 items for more comprehensive assessment of depression history and affective state [37]. Based on factor analysis of the individual items the total depression score was computed from 17 of the 21 depression items included in the surveys [38]. Several versions of depression scales have been used in the HRS. The present analysis used a Center for Epidemiologic Studies Depression (CES-D) score based on 8 items with responses of ‘yes’ or ‘no’. In the NUJLSOA 11 of the original 20 CES-D items were administered with 3 response categories [39]. For all depression scales higher scores reflected greater levels of depression.

Results

All-cause sex- and age-specific mortality rates

Figure 1 illustrates age-specific all-cause mortality rates, sex ratio and absolute difference in age-specific mortality rates in 1998. The age-specific death rates among women and men were the lowest in Japan and, generally, the highest in Denmark in 1998. Only Danish men aged 50–54 and 55–59 had lower all-cause mortality rates than the same-aged men in the US.

Fig. 1 Age- and sex-specific death rates, sex ratios, and absolute difference in 1998

The age-specific sex ratios of mortality rates (male/female ratio of age-standardized mortality rates) indicate that women had consistently lower death rates at all ages in all three countries (Fig. 1). The sex ratio was highest in Japan in almost all age groups, except the 100+ age group. In Denmark, the sex ratio was lower than that in the US at middle and young-old ages, whereas from 75 years the opposite pattern was observed (Fig. 1).
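The two summary measures used in this section, the male/female ratio and the male–female absolute difference in age-specific death rates, can be sketched in Python as follows. The rates in the example are invented for illustration and are not HMD values.

```python
def sex_gap(male_rate, female_rate):
    """Return (sex ratio, absolute difference) for one age group.

    Both rates must be expressed on the same scale, e.g. deaths per 1,000.
    """
    ratio = male_rate / female_rate
    difference = male_rate - female_rate
    return ratio, difference


# Hypothetical death rates per 1,000 by 5-year age group: (men, women)
rates = {"65-69": (20.0, 10.0), "90-94": (250.0, 200.0)}
for age_group, (men, women) in rates.items():
    ratio, diff = sex_gap(men, women)
    print(age_group, ratio, diff)
```

In this invented example the ratio falls with age while the absolute difference grows, which shows why the two measures can follow different age trajectories.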

The trajectories of the sex ratio with age were country-specific. In Denmark, the sex ratio varied in the younger part of the age range, 45–49 to 75–79 years, and it declined steadily at the older ages. In the US, the sex ratio was highest among persons 45–49 and 50–54 years old and declined gradually afterwards. In Japan, the sex gap increased from age 45–49 to 65–69 and then rapidly declined at the older ages.

The male–female absolute difference in the age-specific death rates increased steeply with advancing age until age 90 in the three countries, but afterwards its trajectories were country-specific. In Denmark, the absolute difference continued to increase until the oldest age group when it was the largest. In the US the absolute difference increased until the 90–94 years and remained at the same level in the oldest age groups. In Japan, however, the absolute difference in the age-specific death rates increased until the 95–99 age group, but it decreased substantially among 100+ years old individuals.

Grip strength

In total, handgrip strength was measured at intake in 8,516 Danes (57.4% women), and measurements were available for 7,063 persons (57.2% women) in the HRS and 2,463 individuals (54.1% women) in the NUJLSOA. In all countries grip strength was substantially higher in men compared with the same-aged women (Fig. 2). The sex gap was larger in younger age groups and diminished with increasing age in the US and Denmark. In Japan, the sex difference in grip strength was almost constant across all age groups. Although grip strength declined with increasing age in all three studies, the cross-sectional trajectories suggest that the decline was steeper among men than among women in Denmark and the US, but it was similar in Japanese men and women. Cross-country comparison of grip strength suggested that Japanese men and women had the lowest grip strength at all ages, while Danes had grip strength higher than or similar to that of US men and women. However, differences in testing equipment, testing position and protocol create difficulties for within-sex comparison of grip strength across Denmark, Japan and the US and, thus, the results should be interpreted cautiously (Fig. 2).

Fig. 2 Grip strength

Physical disabilities

The analysis of sex differences in physical disability score in the Danish surveys was based on 5,221 (68.3% women) individuals aged at least 70 years. Valid measurements on physical disabilities were available for 17,930 (58.3% women) in the HRS and for 3,374 (55.6% women) in the NUJLSOA. Disability scores were lower among Danish men than women aged 70–74, 75–79, and 80–84, but women had higher disability scores at the older ages (Fig. 3). Similarly, in Japan at younger age groups men had similar or higher levels of physical disability compared with women, but the female disadvantage in physical function became apparent in the 85–89 and 90+ age groups. In the HRS, men reported fewer physical disabilities compared with women at all ages. The country-specific trajectories of physical disability indicate that the female disadvantage was larger among 85+ years old individuals in all three studies, especially in Japan (Fig. 3).

Fig. 3 Physical disabilities and depression

Within-sex cross-country comparison revealed that the level of physical disabilities was lowest among Danes, although the Japan–Denmark difference was not always statistically significant in men and women. Japanese and US men had similar levels of physical disability, but Japanese women were significantly less disabled than their same-sex US counterparts (P-value < 0.05).

Depression

In total, 10,472 Danes (42.9% women), 16,539 persons (59.3% women) from the HRS, and 3,020 NUJLSOA participants (54.2% women) had valid depression scores. Depression scores tended to be higher in women than in men at almost all ages in Denmark, the US and Japan (Fig. 3). In Japan, men had slightly higher depression levels than women in the 75–79 and 80–84 age groups, but the direction of the sex difference in depression reversed among individuals aged 85+ with women having higher depression scores than men. However, the sex difference failed to reach statistical significance at all ages. In general, depression increased with advancing age in the Danish and Japanese study populations. In the US depression decreased until 65–69 in women and 70–74 in men and then increased until the oldest age group. However, there was no apparent sex-specific pattern in the trajectories of depression over time in any of the three countries.

Self-rated health

Self-rated health was available for 10,583 Danes (57.1% women), 17,852 persons (58.4% women) in the HRS, and 3,028 individuals (54.3% women) in the NUJLSOA. Men and women rated their general health similarly in all three countries (Fig. 4). SRH declined with advanced age and the trajectories of decline were similar in men and women in the three populations.

Fig. 4 Self-rated health

Danes reported significantly better self-rated health compared with the same-sex HRS (P-value < 0.01) and NUJLSOA participants (P-value < 0.01). Within each gender self-rated health was similar in Japan and the US (Fig. 4).

Cognitive function

The analysis of sex differences in immediate recall was based on 10,313 participants of the Danish surveys, 16,481 persons (59.4% women) from the HRS, and 2,869 (54.1% women) NUJLSOA participants. The Danish data showed that women had better performance on the immediate recall test compared with men in all age groups (Fig. 5). Although the sex difference was statistically significant until 80 years, its magnitude was very small (1.02–5.48%), corresponding approximately to the recall of one word. Similarly, in the HRS, women recalled significantly more words at every age, except that the oldest-old men and women had similar immediate recall scores. The smallest sex differences were observed in the NUJLSOA (0.3–2.7%), although the 65–79 years old women had significantly better performance on immediate recall tests compared with the same-aged men. In the oldest ages Japanese men had more favorable performance than their female counterparts, which is probably due to the small numbers and selectivity of very old participants in the NUJLSOA. The performance on the immediate recall test declined with advanced age in all three countries. The trajectories of immediate recall scores were similar in men and women in Denmark, the US, and Japan (Fig. 5).

Fig. 5 Immediate recall

Within-sex comparison across countries revealed that immediate recall score was significantly lower in Denmark than in Japan (P-value < 0.05) and US (P-value < 0.01) counterparts in both genders and at all ages. Men and women in the US had the best performance on the immediate recall test (P-value < 0.01); only in the 75–79 and 90+ age groups Japanese and US men had similar levels of cognitive function. The lowest immediate recall score among Danes is likely to be partially due to methodological differences across the studies.

Valid MMSE scores were obtained for 6,216 individuals (37.4% women) in Denmark and 814 persons (60.6% women) in the HRS. US women had higher MMSE scores than men at all ages, whereas the reverse pattern was observed in Denmark (Fig. 6). However, sex differences in MMSE scores in the US and among middle-aged and younger-old Danes were not statistically significant. Although Danish men aged 80 years and older had significantly higher MMSE scores compared with the same-aged Danish women, the magnitude of the difference was very small. Generally, US women had higher MMSE scores compared with Danish women, but a statistically significant difference was found only in the 75–79 and 80–84 age groups. Danish and US men had similar MMSE scores, except that the oldest-old US men had significantly higher cognitive function compared with the same-aged Danish men.

Fig. 6 Mini-Mental State Exam

Discussion

The present study revealed consistent sex differences in mortality, handgrip strength and physical disabilities across Denmark, Japan, and the US. It showed also that women are inclined to have higher depression scores than men at the same ages and that women and men have similar SRH and cognitive function in the three studies.

Consistent with previous research reports, the present study revealed the highest sex ratio in mortality in Japan. Other studies have documented that at the end of the twentieth century Japan had the highest sex differential in life expectancy compared with other G7 countries except France [3, 5]. Such a divergent pattern of the sex difference in life expectancy in Japan from most other industrialized countries was shown to be due to relatively small improvements in mortality from circulatory diseases and increasing mortality from cancer and external causes among Japanese men compared with their female counterparts. Sex differences in the diffusion of cigarette use within the population may underlie the widening sex differential in mortality in Japan. Due to an extreme scarcity of cigarettes in Japan during and shortly after World War II, cigarettes were rationed to men aged at least 20 years, whereas smoking among women was socially unacceptable and was observed only among old aged women [40, 41]. With post-war economic growth the prevalence of tobacco use increased substantially from the 1950s to the 1980s, although with lower rates and older age of initiation among women [42]. Despite the adoption of healthy lifestyles by younger generations in recent years, cigarette use remains high among Japanese men and stays at low levels among women, 54.4 versus 8.1% in men and women, respectively [40, 41].

Similar to other studies [12, 43], we observed a substantial male advantage in handgrip strength across three countries. Our study suggests that the age-related decline in grip strength may be larger among men than among women in Denmark and the US, but it is similar in Japanese men and women. Investigators studying sex differences in the rate of decline in grip strength have drawn conflicting conclusions. While some studies found that grip strength declines at a faster pace among men than among women [13], others suggested that the rate of decline in women is greater than that in men [44]. Although differences in anthropometric characteristics, testing equipment, testing position and protocol impede cross-country comparison of grip strength in the present study, our finding of highest grip strength in Denmark is consistent with better self-reported health and a tendency toward lower physical disability levels among Danes compared with the same-sex HRS and NUJLSOA participants. In addition, it has been previously reported that Danish nonagenarians and centenarians had higher grip strength than French and Italian elderly even in multivariate analysis [12]. Substantial geographical differences in grip strength were seen in European countries with the highest scores in northern and continental (SE, DK, NL, DE, AT, CH, FR) countries and the lowest in southern countries (ES, IT and GR) [45].

Our analysis revealed only small sex differentials in ADL among young-old individuals in Denmark and Japan, but the US women tended to have more disabilities compared with men at all ages. The female disadvantage in physical function became pronounced with advancing age in the three countries. A recently published cross-country comparison study of gender differences in health at ages 50 years and older in 11 European countries, England and the US has also reported that the sex differences were less pronounced in the abilities to provide self-care, although female disadvantage was apparent in the instrumental activities of daily living [46]. Other studies have also indicated marked sex differences in functional disability among very old individuals and widening sex gap with increasing age [47].

Our findings of no substantial sex differential in SRH and its age-related decline in all three countries run counter to the results from several cross-national comparison studies. In most EU countries, on average, women reported worse health than men [8, 48]. However, some studies revealed no or negligible sex differences in general health in Japan, US, and Denmark [49–51]. Although cross-national differences in self-rated health may reflect also cultural differences in backgrounds, response categories and health perceptions [21], it has been previously reported that Danes had the highest life satisfaction among the 15 Eurobarometer countries from 1975 to 2005 [52].

The present study indicated that, generally, women had higher depression scores in all three countries, although the sex difference was not always statistically significant. A decline in depression observed in the oldest age group in the NUJLSOA is likely to be due to the small number of participants in the oldest age groups, 35 men and 70 women. Also, there were no apparent sex-specific patterns in the trajectories of depression over time in any of the three countries. Although there is a mounting research literature pointing towards substantially higher prevalence of depression in women compared with men from young until early old age [53], the female predominance of depression at older ages is still debatable. Some studies indicated diminished sex difference in depression after menopause [54], while others reported persistent female preponderance at older ages as well [55]. Current knowledge of age-related changes in the prevalence of depression is also inconclusive [56].

Our results suggested small or no sex differential in cognitive functioning as indicated by the immediate recall test in Denmark, the US and Japan. Although women generally performed statistically significantly better than men at younger ages, the sex difference was very small and, thus, likely to be unimportant in everyday life. The performance declined with advanced age in the three countries, but within each country the trajectories of age-related changes in the immediate recall score were similar in men and women. Cross-sectional and longitudinal studies have revealed contradictory results on sex differences in cognitive abilities. Some studies found that women performed better on verbal memory tasks, but men had higher scores on spatial orientation tasks or no sex differences were indicated in non-verbal reasoning [57, 58]. Others found statistically significant female disadvantage in episodic memory, perceptual speed, and digit span, but, as in our study, its magnitude was small [59].

Finally, our data suggested that the sex differential in the MMSE score and its age-related decline was also very small both in Denmark and the US. There seems to be little agreement on the sex differential in cognitive decline and in incidence and prevalence rates of dementia [60]. Some studies showed no sex difference in the level and age-related changes in cognitive function [61, 62] and dementia [63]. Other researchers, however, reported higher incidence rates of Alzheimer’s disease among women [64].

Our study indicated substantially lower scores on immediate recall test in Denmark than in Japan and the US for both genders. Danish women had also lower MMSE scores compared with their US counterparts. Methodological differences in the immediate recall test across the countries may partially explain lower performance among Danes than among their US and Japanese counterparts because it is more difficult to recall a list of 12 words than a list of 10 words. In addition, the lower levels of cognitive function in Denmark than in the US and Japan may also be partially due to educational differences across three study samples. In all Danish surveys combined 7–10% of participants had higher education (11 and more years), whereas the corresponding figure in the HRS was 85%.

Prior research suggested that cross-country differences in health may be partly attributed to the differences in welfare states [65], socioeconomic development, [66] income inequality [67], as well as cultural differences in perception of health and response styles [68]. Nevertheless, within each welfare regime the prevalence of poor/fair general health and longstanding illness was higher among women than in men in all countries, except Ireland and UK with the Anglo-Saxon welfare regime, Finland with Scandinavian welfare system, and the Netherlands and Luxemburg with Bismarckian welfare [65].

One key concern in cross-national comparison studies is the comparability of available data. Although some health domains in the HRS, the NUJLSOA and Danish surveys were measured in comparable but not identical ways, intra-country comparison of sex differences in health is less sensitive to the differences in data collection instruments. Further, although no covariates other than age were considered in the present analysis, several studies demonstrated that female disadvantage in morbidity, e.g. chronic conditions and psychological distress, persisted even after adjustment for age and socio-economic status [69]. Future studies should focus on sex differences in the level and rate of decline in grip strength and physical disabilities using longitudinal data collected in Denmark, Japan and the US, as well as on more detailed analyses of the cross-national differences in health.

In conclusion, the present study suggests that sex differences in all-cause mortality, grip strength and physical disability at middle and old ages were consistent across countries, whereas the pattern of gender differences in depression and its age-related trajectories were country-specific. Additionally, within each country women and men had similar self-rated health and cognitive function. It is remarkable that in all three countries, regardless of substantial sex differences in mortality and dimensions of physical health, no sex gap was indicated in self-reported health. This suggests that the assessment of general health includes a comparison of one's own health with that of same-sex contemporaries of the same age.