Original research
Validity of self-reported diagnoses of gynaecological and breast cancers in a prospective cohort study: the Japan Nurses’ Health Study
  1. Kiyoshi Takamatsu1,
  2. Yuki Ideno2,
  3. Mami Kikuchi3,
  4. Toshiyuki Yasui4,
  5. Naho Maruoka5,
  6. Kazue Nagai5,
  7. Kunihiko Hayashi5
  1. 1Department of Obstetrics and Gynecology, Tokyo Dental College Ichikawa General Hospital, Ichikawa, Japan
  2. 2Center for Mathematics and Data Science, Gunma University, Maebashi, Japan
  3. 3Center of Regional Medical Research and Education, Gunma University Hospital, Maebashi, Japan
  4. 4Department of Reproductive and Menopausal Medicine, University of Tokushima Graduate School of Biomedical Sciences, Tokushima, Japan
  5. 5Department of International/Community Health Laboratory Sciences, Gunma University Graduate School of Health Sciences, Gunma University, Maebashi, Japan
  1. Correspondence to Dr Kiyoshi Takamatsu; ktakamatsu{at}tdc.ac.jp

Abstract

Objectives To validate the self-reported diagnoses of gynaecological and breast cancers in a nationwide prospective cohort study of nursing professionals: the Japan Nurses’ Health Study (JNHS).

Design and setting Retrospective analysis of the JNHS.

Participants and measures Data were reviewed for 15 717 subjects. The mean age at baseline was 41.6±8.3 years (median: 41), and the mean follow-up period was 10.5±3.8 years (median: 12). Participants are regularly mailed a follow-up questionnaire once every 2 years. Respondents who self-reported a positive cancer diagnosis were sent an additional confirmation questionnaire, and, with their consent, the diagnosing facility was contacted to confirm the diagnosis based on medical records. A review panel of experts verified the disease status. Regular follow-up, confirmation questionnaires and expert review were validated for their positive predictive value (PPV) and negative predictive value (NPV).

Results New incidences were verified in 37, 47, 26 and 300 cervical, endometrial, ovarian and breast cancer cases, respectively. The estimated incidence rates were 22.0, 25.4, 13.8 and 160.4 per 100 000 person-years. These were comparable with those of national data from regional cancer registries in Japan. For regular follow-up, the corresponding PPVs for cervical, endometrial, ovarian and breast cancer were 16.9%, 54.2%, 45.1% and 81.4%, and the NPVs were 100%, 99.9%, 99.9% and 99.9%, respectively. Adding the confirmation questionnaire improved the PPVs to 31.5%, 88.9%, 76.7% and 99.0%; the NPVs were uniformly 99.9%. Expert review yielded PPVs and NPVs that were all ~100%.

Conclusions Gynaecological cancer cannot be accurately assessed by self-reporting alone. Additionally, the external validity of cancer incidence in this cohort was confirmed.

  • epidemiology
  • breast tumours
  • gynaecological oncology

Data availability statement

The data are not publicly available due to data transfer agreements.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Strengths and limitations of this study

  • This study investigated the validity of self-reporting of gynaecological and breast cancers in a large, nationwide prospective cohort study of nursing professionals, the Japan Nurses’ Health Study (JNHS).

  • Participants in the JNHS cohort, which is composed entirely of female nursing professionals, are likely to report their cancer history more accurately than the general population.

  • Periodic questionnaires, meticulous review of subjects’ medical records and death certificate surveys were employed to establish self-report validity, circumventing the limitations presented by Japan’s lack of complete national cancer registries.

  • Not all confirmation questionnaires were returned.

  • There was a relatively small number of young participants in this cohort.

Introduction

Self-reporting is frequently used to assess disease status in cohort research. The methodology’s cost-effectiveness and feasibility make it an attractive approach in countries without comprehensive national disease registries such as Japan. However, the unreliability of self-reported information is problematic and can introduce errors into epidemiological investigations of risk factors, especially for new cancer incidences in a cohort. Self-reporting appears to accurately reflect diabetes status and surgical history of hysterectomies;1 2 however, body weight is often under-reported.3 Regarding patients’ cancer history, healthcare providers must consider that an affirmative response on a questionnaire is not equivalent to a definitive medical diagnosis because patients may remember incorrectly. Ideally, their answers should be corroborated against their medical records, but these typically cannot be acquired for an entire cohort. Additionally, validity can depend on background factors, such as ethnicity and cohort-specific characteristics, which further complicates interpreting self-report data. Consequently, the validity of self-reported diagnoses of gynaecological and breast cancers in Japan remains unclear.

The Japan Nurses’ Health Study (JNHS) is a nationwide prospective cohort study of over 15 000 female nurses, which began in 2001 to ascertain how women’s health is affected by lifestyle factors, healthcare practices and physical status over their lifetime.4 Here, we investigated the validity of self-reported diagnoses of three gynaecological cancers (ie, cervical, endometrial and ovarian) and breast cancer in our cohort. Also, we checked the external validity of our cohort by confirming the cancer incidence.

Methods

Subjects

The JNHS is an ongoing prospective cohort study investigating the association between lifestyle, healthcare practices and women’s health in Japan. Detailed information on its design, population, protocol and sample size calculations was published previously.4 5 Briefly, the baseline survey was conducted from 2001 to 2007, with planned follow-up for 30 years. In total, 15 019 women agreed to follow-up, signing and returning the informed consent form with the completed survey. At the time of the baseline survey, the study population consisted of female licensed nursing professionals, including registered nurses, licensed practical nurses, public health nurses and midwives, aged ≥25 years and residing in Japan. Follow-up is currently ongoing; subjects are regularly mailed a self-administered questionnaire once every 2 years to complete and return by post.

Before initiating the JNHS, the feasibility of its research strategy and the validity of its questionnaires were investigated and confirmed in a pilot cohort study started in 1999 (the Gunma Nurses’ Health Study (GNHS); n=698).6 7 We combined the JNHS and GNHS datasets in the present work as the JNHS cohort (n=15 717). Table 1 shows the number of subjects in each age group. Women had a mean age at baseline of 41.6 (8.3) years (mean (SD); median: 41 years) and a mean follow-up of 10.5 (3.8) years (median: 12 years).

Table 1

Numbers and percentages of subjects in each age group at baseline in the JNHS cohort

The JNHS Coordination and Data Center is located in the Epidemiological Research Office of the School of Health Sciences at Gunma University. This study was performed under the Declaration of Helsinki, the Guidelines for Good Epidemiology Practices8 and the Japanese Ethical Guidelines for Epidemiological Research.9

Data collection and corroboration

In the baseline and regular biennial follow-up questionnaires, women were asked, ‘Have you ever been diagnosed with breast cancer (cervical cancer, endometrial cancer, or ovarian cancer) by a medical doctor?’, and if so, what was their age at first diagnosis. We identified and isolated those women who self-reported new incidences of one of the cancers of interest in the regular follow-up by July 2017.

To corroborate the self-reported positive cases, an additional confirmation questionnaire was sent to those women who affirmed a new cancer diagnosis in the regular follow-up. Subjects were again asked the same question as above and to provide details about their date of/age at diagnosis, method of detection, tumour stage and treatment history. We also asked for permission to access their medical records; if they consented, we reviewed the records to obtain accurate clinical information on their condition. For gynaecological cancers, the data collected included date of diagnosis, clinical stage, histological type, treatments and concomitant cancer(s). For breast cancer, the data included date of diagnosis, tumour site, invasivity, tumor–node–metastasis classification (Union for International Cancer Control, 7th edition),10 diagnostic method(s), tumour size, mammography category, surgical procedure, histological classification and pathological classification (ie, regional lymph node involvement and hormone receptor positivity for oestrogen receptor, progesterone receptor and human epidermal growth factor receptor type 2). This clinical information was furnished to an expert review panel comprising specialists on gynaecological and breast cancers to verify each self-reported positive diagnosis.

In Japan, the clinical reporting of gynaecological cancers follows the Japan Society of Obstetrics and Gynecology (JSOG) staging system, which is based on the internationally recognised surgical staging system published by the International Federation of Gynecology and Obstetrics (FIGO). When the FIGO criteria were updated during the study period in 2011,11 the JSOG system was revised in tandem to remove stage 0 lesions from the corresponding definitions, that is, cervical carcinoma in situ (CIS) and atypical endometrial hyperplasia from cervical and endometrial cancer, respectively. Therefore, stage 0 cancers were not considered positive in our primary analysis, and all medical records were double-checked for patients who self-reported a new incidence of gynaecological cancer before 2011. These borderline cases were excluded.

If a subject was reported as deceased or inexplicably failed to complete any recent study activities, we established the cause of death by checking death certificate-related information in Japan’s National Vital Statistics database.

Validation

Regular follow-up, confirmation questionnaires and expert review were validated for their positive predictive value (PPV) and negative predictive value (NPV) for new incidences of each cancer.

For the first two sources, the validation sample included all members of the study cohort (n=15 717) who reported no history of the cancer in question at baseline. The PPV of the regular follow-up was calculated as the number of verified positive cases of the cancer, that is, cases whose self-reported positive diagnosis was verified by medical-record review or cause-of-death investigation, divided by all cases of self-reported new incidences of the cancer in the regular follow-up. The NPV was calculated as the number of suspected negative cases, divided by all members of the validation sample who self-reported no new cancer incidence in the regular follow-up. Here, the suspected negative cases consisted of all members of the validation sample for the cancer in question minus: (A) cases who self-reported new incidences in the regular follow-up and (B) positive cases whose status was established by death certificate only (DCO).
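
The regular follow-up calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the function names are hypothetical, the PPV inputs (verified cases and self-reported cases) are the counts given in the Results, and the sample size and DCO count passed to the NPV function are made-up placeholders, because the exact denominators are not stated in the text.

```python
def ppv(verified_positive, self_reported_positive):
    """Verified positive cases divided by all self-reported new incidences."""
    return verified_positive / self_reported_positive


def npv_regular_follow_up(sample_size, self_reported_positive, dco_cases):
    """NPV as defined for the regular follow-up.

    Denominator: sample members who self-reported no new incidence.
    Numerator (suspected negatives): the validation sample minus
    (A) self-reported new incidences and (B) death-certificate-only cases.
    """
    reported_negative = sample_size - self_reported_positive
    suspected_negative = sample_size - self_reported_positive - dco_cases
    return suspected_negative / reported_negative


# Counts from the Results: (verified by expert review, self-reported).
counts = {
    "cervical": (37, 219),
    "endometrial": (45, 83),
    "ovarian": (23, 51),
    "breast": (297, 365),
}
for cancer, (verified, reported) in counts.items():
    print(f"{cancer}: PPV = {ppv(verified, reported):.1%}")

# Hypothetical inputs (NOT from the paper) to show the NPV arithmetic.
example_npv = npv_regular_follow_up(sample_size=15_000,
                                    self_reported_positive=100,
                                    dco_cases=2)
print(f"example NPV = {example_npv:.2%}")
```

With rare outcomes, the NPV denominator is dominated by true negatives, which is why it stays close to 100% even when the PPV is low.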

The PPV of the combined regular follow-up and confirmation questionnaire was calculated as the number of verified positive cases of the cancer divided by all cases who corroborated their positive diagnosis on the confirmation questionnaire. The NPV was calculated as the number of suspected negative cases divided by all members of the validation sample except those who self-reported their positive diagnosis on the confirmation questionnaire. Here, the suspected negative cases consisted of all members of the validation sample minus: (A) cases who self-reported their positive diagnosis on the confirmation questionnaire, (B) cases ruled positive by DCO, (C) cases ruled positive by cause-of-death investigation and (D) contradictory cases (ie, women confirmed by expert review who self-reported a negative status on the confirmation questionnaire or left the field blank).

The expert review panel’s judgments were also validated for comparison. In this analysis, the validation sample consisted of all participants who: (A) returned the confirmation questionnaire, (B) permitted the research team to contact their diagnosing facility and (C) had a provider who agreed to respond to the team’s inquiry. The PPV was calculated as the number of cases verified as positive by the diagnosing facility, divided by the number of cases ruled positive by the expert review panel. The NPV was calculated as the number of cases verified as negative by the diagnosing facility, divided by the number of cases ruled negative by the panel.

After the cancer cases were finalised, the incidence rate of each cancer was estimated from the observed events and person-time at risk for 10 years of observation. Because the numbers of participants aged <30 and ≥60 years were small, the 30–60 years age group was used. We calculated the 95% CIs of the incidence rates based on the exact Poisson CI in accordance with known methods.12
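
For readers unfamiliar with exact Poisson intervals, the sketch below shows one standard construction (the Garwood interval), found here by bisection on the Poisson distribution function using only the Python standard library. It is an assumption that this matches the method of reference 12; the function names, the event count and the person-years in the usage lines are hypothetical and are not taken from the study.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu); each term is computed in log space."""
    if k < 0:
        return 0.0
    if mu <= 0:
        return 1.0
    return math.fsum(
        math.exp(i * math.log(mu) - mu - math.lgamma(i + 1))
        for i in range(k + 1)
    )


def _root_of_decreasing(f, lo, hi, iterations=200):
    """Bisection for a function that is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(events, alpha=0.05):
    """Exact (Garwood) confidence interval for a Poisson mean."""
    if events == 0:
        lower = 0.0
    else:
        # Lower limit solves P(X >= events | mu) = alpha / 2.
        lower = _root_of_decreasing(
            lambda mu: poisson_cdf(events - 1, mu) - (1 - alpha / 2),
            0.0, float(events),
        )
    # Upper limit solves P(X <= events | mu) = alpha / 2.
    upper_bracket = events + 10 * math.sqrt(events + 1) + 20
    upper = _root_of_decreasing(
        lambda mu: poisson_cdf(events, mu) - alpha / 2,
        float(events), upper_bracket,
    )
    return lower, upper


def incidence_rate_with_ci(events, person_years, per=100_000):
    """Rate and exact CI limits, scaled to `per` person-years."""
    lower, upper = exact_poisson_ci(events)
    scale = per / person_years
    return events * scale, lower * scale, upper * scale


# Hypothetical example: 40 events observed over 180 000 person-years.
rate, lo, hi = incidence_rate_with_ci(40, 180_000)
print(f"{rate:.1f} per 100 000 person-years (95% CI {lo:.1f} to {hi:.1f})")
```

The interval is asymmetric around the point estimate, which matters most for the rarer cancers, where event counts within an age group are small.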

Patient and public involvement

This research was done without involving participants in defining the research question, outcome measures or study design. Participants were recruited through study information provided to nursing societies. They were not invited to comment on the design or to interpret the results, nor were they invited to contribute to the writing or editing of the manuscript. The results will be reported to participants in the JNHS newsletter and posted on the JNHS website.

Results

Verified cases of each cancer type

The flow diagram illustrating the validation process of each cancer is shown in the web appendices (online supplemental appendix 1). The numbers of new cases of self-reported cancers in the regular follow-up (and incidences in the respective validation sample) were cervical cancer: 219 (1.4%), endometrial cancer: 83 (0.5%), ovarian cancer: 51 (0.3%) and breast cancer: 365 (2.3%). New incidence was verified by expert review in 37, 45, 23 and 297 of these cases, respectively. Some subjects returned the confirmation questionnaire corroborating their positive diagnosis but were ruled negative by the expert panel (72.1%, 11.1%, 30.3% and 1.0%, respectively), while 37.6%, 33.8%, 25.6% and 8.3% of subjects, respectively, reported a negative diagnosis on the confirmation questionnaire.

For all observed cases of mortality, cause of death was established as being cervical cancer (n=4, DCO=0), endometrial cancer (n=7, DCO=2), ovarian cancer (n=3, DCO=3) or breast cancer (n=16, DCO=3). New incidences of the four cancers were verified in 37, 47, 26 and 300 cases, respectively.

In the JNHS cohort, the estimated incidence rates for participants aged 30–60 years were 22.0/100 000, 25.4/100 000, 13.8/100 000 and 160.4/100 000 person-years for cervical, endometrial, ovarian and breast cancer, respectively (table 2). To assess heterogeneity between this cohort and Japanese women overall, the incidence rates for each age group were compared with the national data from regional cancer registries in the 2015 statistics published by Japan’s National Cancer Center13 (figures 1–4). For all four cancers, the cohort data did not deviate from the national data.

Figure 1

Estimated incidence rates of cervical cancer for each age group in the JNHS cohort and the national data from regional cancer registries (error bars show the 95% CIs). JNHS, Japan Nurses’ Health Study.

Figure 2

Estimated incidence rates of endometrial cancer for each age group in the JNHS cohort and the national data from regional cancer registries (error bars show the 95% CIs). JNHS, Japan Nurses’ Health Study.

Figure 3

Estimated incidence rates of ovarian cancer for each age group in the JNHS cohort and the national data from regional cancer registries (error bars show the 95% CIs). JNHS, Japan Nurses’ Health Study.

Figure 4

Estimated incidence rates of breast cancer for each age group in the JNHS cohort and the national data from regional cancer registries (error bars show the 95% CIs). JNHS, Japan Nurses’ Health Study.

Table 2

Estimated incidence rate of each cancer in patients aged 30–60 years in the JNHS cohort

Self-reported PPV/NPV for each cancer

Table 3 summarises the PPVs and NPVs for the regular follow-up, regular follow-up plus confirmation questionnaire and expert review for the new incidence of each cancer. Expert review achieved 100% accuracy for each cancer except cervical (PPV: 92.3%) because of a single false-positive case, which the participant’s provider clarified to be a different condition.

Table 3

PPVs/NPVs of regular follow-up, regular follow-up plus confirmation questionnaire and expert review for new incidences of gynaecological and breast cancers in the JNHS cohort

Self-reporting achieved NPVs near 100% for all cancers for both the regular follow-up and the regular follow-up plus confirmation questionnaire. However, the corresponding PPVs tended to be somewhat lower and variable across cancers. The PPVs were worse for gynaecological cancers than for breast cancer (breast > endometrial > ovarian > cervical, in descending order) for both follow-up sources. The PPV for uterine cancer, which included cervical and endometrial cancers, was 27.2%.
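
As a quick arithmetic check, the pooled uterine figure follows from summing the cervical and endometrial counts reported for the regular follow-up (37 and 45 verified cases among 219 and 83 self-reports), not from averaging the two PPVs. A minimal sketch, with hypothetical variable names:

```python
# Verified and self-reported new incidences from the regular follow-up.
verified = {"cervical": 37, "endometrial": 45}
self_reported = {"cervical": 219, "endometrial": 83}

# Pooled PPV: ratio of summed counts.
pooled_ppv = sum(verified.values()) / sum(self_reported.values())

# For contrast, the unweighted mean of the two separate PPVs.
mean_of_ppvs = sum(
    verified[c] / self_reported[c] for c in verified
) / len(verified)

print(f"pooled uterine PPV: {pooled_ppv:.1%}")
print(f"unweighted mean of the two PPVs: {mean_of_ppvs:.1%}")
```

The pooled value sits closer to the cervical PPV because cervical self-reports make up most of the combined denominator.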

The regular follow-up plus confirmation questionnaire achieved higher PPVs in all cases than did regular follow-up alone; however, while it achieved 99.0% accuracy for breast cancer, the estimates were lower for endometrial (88.9%) and ovarian (76.7%) cancers and poor for cervical cancer (31.5%).

Considering the changes to the official JSOG clinical staging system during the survey period, we calculated a similar summary for PPVs and NPVs, adding cases of cervical CIS, atypical endometrial hyperplasia and borderline ovarian tumours (table 4).

Table 4

Corresponding PPVs/NPVs including cervical CIS, atypical endometrial hyperplasia and borderline ovarian tumours in the JNHS cohort

The resulting PPVs were uniformly higher when these three borderline forms were included than when they were excluded. For endometrial and ovarian cancer, the improvements ranged from 3.3 to 6.7 percentage points, but for cervical cancer, their inclusion more than doubled the predictive value for both the regular follow-up and regular follow-up plus confirmation questionnaire, at +20.1 and +37.9 percentage points, respectively.

Discussion

In the JNHS cohort, self-reporting in regular follow-up achieved a PPV of 81.4% for breast cancer but performed worse for gynaecological cancers, especially uterine cancers (PPV: 27.2%) and cervical cancer alone (PPV: 16.9%). Our PPVs were higher than the corresponding values reported by the Japan Public Health Center Study, a population-based prospective cohort study (all cancers in women: 54.2%, breast: 58.4%, uterine: 21.7%).14 The validity of self-reporting is associated with individual characteristics,15 and our cohort consisted entirely of nursing professionals. While evidence suggests that educational level has a negligible association with validity,16 we partially attribute the high self-reporting accuracy to the uniformly high level of medical education and deeper knowledge of cancer in our cohort than in the general population. Other studies support this argument.17 However, sizeable percentages of nurses who affirmed new incidences of cancer in the regular follow-up gave the opposite response on the confirmation questionnaire (gynaecological cancer: 25.6%–37.6% and breast cancer: 8.3%). Similarly, considerable percentages of respondents to the confirmation questionnaire were verified not to have cancer (gynaecological cancer: 41.2%–81.2% and breast cancer: 9.2%). Many who corroborated their self-reported positive diagnosis were eventually ruled negative by expert review, especially for cervical cancer (72.1%), followed by ovarian (30.3%), endometrial (11.1%) and breast (1.0%). In summary, self-reporting alone apparently fails to capture the real cancer incidence, even for this cohort of nursing professionals with uniformly high medical knowledge. Additional inquiries to confirm the details are needed.

Compared with PPVs of self-report validity in other prospective cohort study datasets,16 18 19 our PPVs were comparable with the literature values for breast cancer but lower than these values for uterine cancers. Many studies have shown that self-reporting of breast cancer has high PPVs.10 16 19 Some evidence has linked higher educational levels with a greater risk of breast cancer,20 which may also be true for our cohort. Additionally, breast cancer diagnoses included ductal carcinoma in situ, which may have led to less confusion than with gynaecological cancers that excluded stage 0 cases and borderline tumours.

Studies outside of Japan have also found self-reporting to yield lower PPVs for uterine cancers than for other cancers,18 21 for several possible reasons. One is inaccurate memory of precancerous cervical lesions, which are rarely addressed immediately by surgical intervention. Additionally, age and sex may have some association; for example, participants >50 years old in a Native American cohort were more likely to report incorrectly.22 Furthermore, a study from Australia found that self-reported breast cancer had lower PPVs in women aged 70–75 years.23 Disease-specific considerations may also be relevant. One study noted that many cases of women’s cancers, especially cervical cancer, are not recorded in cancer registries,22 while another estimated false-negative rates of 43.8%, 28.6% and 20.8% for self-reports of uterine, ovarian and breast cancers, respectively.24 Differences in incidence must also be considered. Because gynaecological cancers are >5 times less prevalent than breast cancer, a difference of one case would produce a proportionally larger change in PPV.
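
The arithmetic behind this last point can be illustrated with the regular follow-up counts from the Results (ovarian: 23 verified of 51 self-reports; breast: 297 of 365). The sketch below, with a hypothetical helper name, shows how far the PPV moves when a single additional self-report is verified.

```python
def ppv_shift_per_case(verified, self_reported):
    """Percentage-point change in PPV if one more self-report were verified."""
    before = verified / self_reported
    after = (verified + 1) / self_reported
    return 100 * (after - before)


ovarian_shift = ppv_shift_per_case(23, 51)    # roughly 2 percentage points
breast_shift = ppv_shift_per_case(297, 365)   # roughly 0.3 percentage points

print(f"ovarian: {ovarian_shift:.2f} points per case")
print(f"breast:  {breast_shift:.2f} points per case")
```

Because the shift equals 100 divided by the number of self-reports, the estimate for the rarer cancers is several times more sensitive to each individual reclassification.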

One problem specific to Japan regarding the self-reporting of women’s cancers is how the results of cytological screening tests are reported for cervical and endometrial cancers. Today, Pap smear results are recorded using the Bethesda system, the standard international format, but these results previously followed a class-based system. Class II status, which indicates a result within the normal range, is sometimes confused with stage II cervical cancer. Similarly, atypical endometrial hyperplasia was previously classified as stage 0 endometrial cancer, which may be confused with non-atypical endometrial hyperplasia.

We suspect that another reason the self-report validity in our cohort was so poor for certain cancers was that subjects were recalling their medical history during the regular follow-up, rather than the new incidence as intended. Additionally, ambiguous language in the questionnaire, such as ‘dysplasia’ or ‘precancerous lesions’, may have reduced the self-report validity, as evidenced by the higher PPVs when borderline forms, such as cervical CIS, atypical endometrial hyperplasia and borderline ovarian tumours, were included in the analysis. Among the three borderline forms, classifying cervical CIS as cervical cancer led to a greater increase in PPVs than did the other two. Manjer et al25 also found that self-reporting of malignant cervical cancer was less sensitive when the definition included cervical CIS. These considerations suggest that compared with other cancers, diagnoses of cervical cancer and precancerous lesions have a greater risk of being inaccurately communicated or negatively interpreted by patients.

One of this study’s strengths was our meticulous review of subjects’ medical records and death certificate surveys to establish self-report validity, circumventing the limitations presented by Japan’s lack of complete national cancer registries. Additionally, we believe that our data better reflect the general Japanese population than did past findings for other regional cohorts because the nationwide scope of the JNHS minimises the geographical variation. Moreover, our cohort was relatively homogenous in terms of sex and occupation, consisting entirely of female nursing professionals.

The study also had some limitations. Cohort-specific characteristics may limit the generalisability of our findings, especially the relatively young skew of the participants’ ages. However, when converted to incidence rates, our results seem largely consistent with the 2015 statistics published by Japan’s National Cancer Center.13 Additionally, self-reported diagnoses could not be verified in some cases. Our expert panel made their judgments based on the specific language nurses used in the questionnaire to describe their treatments, such as ‘hysterectomy’ and ‘chemotherapy’, but the panel still encountered cases that were difficult to definitively verify. However, we established a conclusive diagnosis based on all available information, such as postmortem exam findings and supplemental details from primary care providers. No indeterminate cases were found among those lacking medical records for verification.

Conclusion

In Japan, too, gynaecological cancer cannot be accurately assessed by self-reporting alone. However, the external validity of cancer incidence in the JNHS, as ascertained with our method, was confirmed. As the JNHS database covers all of Japan, these results allow further investigation of risk factors for different cancers, such as menopausal hormone therapy and lifestyle factors, and their associations, unaffected by information bias. We plan to continue our work by analysing the respective contributions of different risk factors among confirmed cases of gynaecological and breast cancer, as verified above.

Ethics statements

Ethics approval

The GNHS study protocol was approved by the institutional review board of Gunma University, Japan (approval #3, 1999), and the JNHS study protocol was approved by the institutional review board of Gunma University, Japan (approval #101, 2001) and the ethics review board of Japan’s National Institute of Public Health, Japan (approval #03007, 2003).

Acknowledgments

The authors appreciate the cooperation of the Japanese nurses who participate in the Japan Nurses’ Health Study (JNHS) and the Gunma Nurses’ Health Study (GNHS). We would also like to thank Ms Satomi Shimizu at the JNHS Data Center for her help with data management.

Footnotes

  • Contributors KT was a panel member and wrote the initial draft of the paper, to which all authors contributed. YI, NM and KN collected and analysed the data. MK and TY were panel members and revised the manuscript. KH designed the study, raised funding and directed its implementation, including quality assurance and control.

  • Funding This work was partly supported by a grant from the Japan Society for the Promotion of Science (JSPS KAKENHI Grant Number: 18H04069).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.