Cohort profile: Study of Transition, Outcomes and Gender (STRONG) to assess health status of transgender people
  1. Virginia P Quinn1,
  2. Rebecca Nash2,
  3. Enid Hunkeler3,
  4. Richard Contreras1,
  5. Lee Cromwell4,
  6. Tracy A Becerra-Culqui1,
  7. Darios Getahun1,
  8. Shawn Giammattei5,
  9. Timothy L Lash2,
  10. Andrea Millman6,
  11. Brandi Robinson4,
  12. Douglas Roblin7,
  13. Michael J Silverberg6,
  14. Jennifer Slovis8,
  15. Vin Tangpricha9,10,
  16. Dennis Tolsma4,
  17. Cadence Valentine1,
  18. Kevin Ward2,
  19. Savannah Winter4,
  20. Michael Goodman2
  1. 1 Department of Research & Evaluation, Kaiser Permanente Southern California, Pasadena, California, USA
  2. 2 Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
  3. 3 Division of Research, Kaiser Permanente Northern California (emerita), Oakland, California, USA
  4. 4 Center for Clinical and Outcomes Research, Kaiser Permanente Georgia, Atlanta, Georgia, USA
  5. 5 The Rockway Institute, Alliant International University, San Francisco, California, USA
  6. 6 Division of Research, Kaiser Permanente Northern California, Oakland, California, USA
  7. 7 School of Public Health, Georgia State University, Atlanta, Georgia, USA
  8. 8 The Permanente Medical Group, Kaiser Permanente Northern California, Oakland, California, USA
  9. 9 Emory University School of Medicine, Atlanta, Georgia, USA
  10. 10 The Atlanta VA Medical Center, Atlanta, Georgia, USA
  1. Correspondence to Dr Michael Goodman; mgoodm2{at}


Purpose The Study of Transition, Outcomes and Gender (STRONG) was initiated to assess the health status of transgender people in general and following gender-affirming treatments at Kaiser Permanente health plans in Georgia, Northern California and Southern California. The objectives of this communication are to describe methods of cohort ascertainment and data collection and to characterise the study population.

Participants A stepwise methodology involving computerised searches of electronic medical records and free-text validation of eligibility and gender identity was used to identify a cohort of 6456 members with first evidence of transgender status (index date) between 2006 and 2014. The cohort included 3475 (54%) transfeminine (TF), 2892 (45%) transmasculine (TM) and 89 (1%) members whose natal sex and gender identity remained undetermined from the records. The cohort was matched to 127 608 enrollees with no transgender evidence (63 825 women and 63 783 men) on year of birth, race/ethnicity, study site and membership year of the index date. Cohort follow-up extends through the end of 2016.

Findings to date About 58% of TF and 52% of TM cohort members received hormonal therapy at Kaiser Permanente. Chest surgery was more common among TM participants (12% vs 0.3%). The proportions of transgender participants who underwent genital reconstruction surgeries were similar (4%–5%) in the two transgender groups. Results indicate that there are sufficient numbers of events in the TF and TM cohorts to further examine mental health status, cardiovascular events, diabetes, HIV and most common cancers.

Future plans STRONG is well positioned to fill existing knowledge gaps through comparisons of transgender and reference populations and through analyses of health status before and after gender affirmation treatment. Analyses will include incidence of cardiovascular disease, mental health, HIV and diabetes, as well as changes in laboratory-based endpoints (eg, polycythemia and bone density), overall and in relation to gender affirmation therapy.

  • transgender
  • cohort
  • electronic medical records

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Strengths and limitations of this study

  • Perhaps the most important strength of this study is systematic cohort identification without a need for participant opt-in.

  • Additional strengths include accurate determination of gender identity and comprehensive ascertainment of hormonal and surgical treatment received at Kaiser Permanente.

  • Notable weaknesses of the study include relatively short follow-up, dearth of information on gender affirmation treatment received outside of the health plans and lack of data on outcomes not captured in the medical records. 


Transgender people are a diverse group of individuals whose biological sex does not match their gender identity.1 Typically, sex is assigned at birth based on the appearance of the genitalia.2 In contrast, an individual’s gender identity is defined as being a male/man, female/woman or of a different gender.2 3 Many transgender people may not self-identify based on binary definitions;4 however, a person whose gender identity differs from a male natal sex assignment is often referred to as male-to-female or trans woman, and a person whose gender identity differs from a female natal sex is often referred to as a female-to-male or trans man.5 6 More recently, the terms transfeminine (TF) and transmasculine (TM) have become preferred as they also apply to individuals who do not identify with binary gender categories.7

Transgender individuals sometimes seek medical gender affirmation, which may involve administration of cross-sex hormone therapy (HT) to achieve desired masculinisation or feminisation, and/or surgical change of the genitalia and other sex characteristics.8 9 Although several organisations have established guidelines for clinical care of transgender patients,9 10 many issues in transgender health and gender affirmation therapy remain unresolved due to lack of direct evidence.11 Consequently, many current practices and standards of care are based on expert opinion, case reports or extrapolation of research findings from other populations. For example, current recommendations regarding risk of venous thromboembolism in TF patients receiving oestrogen are based on the observed effects of hormone replacement therapy in postmenopausal women.12 13 Similarly, expected health risks in TM are inferred from comorbidities associated with polycystic ovary syndrome.14

Critical knowledge gaps include the effect of HT and surgery on gender dysphoria (the feeling of distress when natal sex does not match gender identity15) and other mental health issues, haematological side effects of HT and risk of cardiovascular disease, metabolic or endocrine disorders and cancer following hormonal or surgical gender affirmation.16 A direct evaluation of some of these issues requires longitudinal studies with large numbers of TM and TF participants with sufficient follow-up and variable history of surgical interventions and cross-sex HT use.17 These considerations motivated the design of the longitudinal cohort for ‘Study of Transition, Outcomes and Gender (STRONG)’. This paper provides a summary of the challenges facing transgender health research, describes the main elements of the STRONG study design and data collection and discusses lessons learnt during the implementation of this project. In this ‘cohort profile’ communication, we offer a detailed documentation of methods used to assemble, validate and characterise the STRONG cohort and offer an overall description of the study population that will provide data for a multitude of subsequent hypothesis-testing studies.

Methodological challenges facing transgender health studies

The methodological challenges in observational studies of transgender health fall into five categories: (1) attaining sufficient sample size and statistical power; (2) systematic and comprehensive identification of eligible study participants and comparable reference groups; (3) determination of natal sex and/or gender identity; (4) assessment of current and past gender affirmation treatment and (5) engagement of patient and physician stakeholders at all stages of research.17

Transgender people represent a hard-to-reach population, and to date most existing cohorts assessing transgender health were assembled in specialised clinics that provide gender affirmation care.18 19 This approach provides good options for collecting detailed treatment data and biospecimens but may exclude individuals who have not sought or who have already completed treatment, and makes it difficult to select comparable reference groups.20 In addition, establishing a large clinic-based cohort requires coordination across multiple sites, is costly and may still have small numbers resulting in studies of relatively low statistical power.17

Besides clinic-based studies, previous efforts to identify transgender individuals for health research also involved population surveys21 and reviews of electronic records for relevant International Classification of Diseases (ICD) codes.22 23 While surveys offer generalisable population estimates, they may be affected by recall bias and low response rates and require large-scale efforts to identify sufficient numbers of transgender people. Reliance on ICD codes allows identifying large samples from insurance claims or electronic medical records (EMRs), but may exclude eligible study participants who do not have or do not wish to receive a transgender-specific diagnosis.24

A critical aspect of transgender research is accurate identification of gender identity. In recent years, the US Department of Health and Human Services issued a directive that EMR systems should enable providers to record gender identity and sexual orientation.25 This directive should improve documentation over time; but in the meantime, determination of TF or TM status presents a methodological challenge because the available demographic data can reflect natal sex or gender identity, without specifying which is which. Assessing TM/TF status can be achieved by asking two questions about natal sex and gender identity26; however, reliance on self-report requires contact with individual participants and is subject to non-response, which increases the risk of selection bias.

Insurance coverage for gender affirmation therapy has increased over time27–29; however, it remains sporadic and incomplete, and many transgender people seek care outside of regular healthcare plans.30 This presents a specific challenge for any study that aims to ascertain hormonal exposures and surgical procedures, particularly for transgender people with no or inadequate insurance coverage and for persons who initiated gender-affirming therapy years ago.

In a rapidly developing field, such as transgender health, even experienced researchers may lack specific expertise required to prioritise research questions and select the most relevant patient-centred outcome measures. Thus, it is important to involve members of the transgender community and their physicians to ensure that the proposed research questions, study design and data collection methods are relevant, appropriate and feasible.31 Conversely, as the research methods and interpretation of findings are becoming increasingly complex, researchers should communicate with stakeholders directly to convey information relevant to their decision-making.32

In sum, transgender health research faces significant methodological challenges and logistical barriers. These challenges and barriers contribute to the lack of knowledge about the health risks in this population and preclude development of evidence-based recommendations for transgender healthcare.11

Cohort description

Study goals, design and setting

STRONG was initiated in September 2013 with the primary long-term goal of assessing the health status of transgender individuals overall, among TF/TM subgroups, and in both subgroups after different types of gender-affirming treatment. It was designed as an EMR-based retrospective/prospective cohort study of transgender members enrolled in three Kaiser Permanente (KP) health plans located in Georgia (KPGA), Northern California (KPNC) and Southern California (KPSC). These health plans are prepaid integrated care systems and currently provide comprehensive health services to approximately 8 million members. Individuals and their families may enrol through an employer, state or federal programmes such as Medicaid and Medicare, or directly. The populations of enrollees are sociodemographically diverse and broadly representative of the communities in the corresponding areas.33 34

The three KP organisations are members of several research consortia including the Healthcare Systems Research Network35 and the Mental Health Research Network.36 They share similarly structured databases organised into ‘Virtual Data Warehouses’ (VDW), with files stored behind security firewalls at each site. The files have identical variable names, formats and specifications that allow using centrally generated and/or distributed programs to create harmonised analytic datasets.37 VDW files are linked by unique individual identifiers, allowing researchers to construct historical and prospective cohorts.38

The study was conducted in partnership with Emory University, which served as the coordinating centre. All activities described in this manuscript were reviewed and approved by the Institutional Review Boards (IRBs) of the four institutions.

Stakeholder involvement was achieved by assembling a project advisory group. To recruit the Stakeholder Advisory Group members, investigators at each site were charged with identifying a leading clinician specialising in transgender care. The physicians were then asked to nominate one or two patients to serve as representatives of the transgender community. The resulting Stakeholder Advisory Group included 12 members. The stakeholders made a number of important contributions to the project at various stages of study design, planning and implementation, as described below.

Cohort ascertainment

Figure 1 shows the three-step algorithm used to identify transgender cohort members. It includes an initial EMR search to identify cohort candidates (step 1), validation of transgender status (step 2) and determination of TM/TF status (step 3).

Figure 1

STRONG transgender cohort ascertainment flow diagram. EMR, electronic medical record; ICD-9, International Classification of Diseases, Ninth edition; STRONG, Study of Transition, Outcomes and Gender; TF, transfeminine; TM, transmasculine.

Step 1: Initial EMR search

A computer program written using Statistical Analysis Software (SAS) V.9.4 (SAS Institute Inc., Cary, NC) was used to search the EMRs of KPGA, KPNC and KPSC members of all ages enrolled between 1 January 2006 and 31 December 2014 to identify two types of evidence supporting transgender status: (1) relevant International Classification of Diseases, Ninth edition (ICD-9) codes and (2) presence of relevant specific keywords in free-text clinical notes (table 1). The program was developed and pilot-tested at KPGA and then distributed to the remaining sites. Cohort ascertainment was undertaken before the health plans switched to ICD-10 codes.

Table 1

ICD-9 codes and keywords used to identify potentially eligible STRONG transgender cohort members among KPGA, KPSC and KPNC members

The diagnostic ICD-9 codes suggestive of transgender status were selected based on consultations with the STRONG Stakeholder Advisory Group and methodologies described in earlier studies.23 39 Transvestic fetishism (ICD-9 code 302.3) was included based on previous observations that men who initially meet criteria for this diagnostic category may later experience persistent gender dysphoria consistent with transgender status.22

We also used ICD-9 V codes, which allow for supplementary classification of factors influencing health status.40 41 As V codes may cover several conditions, we used them in conjunction with internal KP codes to ensure specificity. For example, a combination of ICD-9 code V49.89 and KP code 121141596 means ‘Other conditions influencing health: transgender’.

The second method of transgender ascertainment involved another custom-written program that identified the relevant keywords in free-text clinical notes, recognising that both appropriate and inappropriate terms could be found in the EMR. During pilot testing, an expanded list of keywords provided by the stakeholders was used; that list was gradually shortened after stepwise removal of keywords that did not contribute additional cases. The resulting list provided a complete cohort ascertainment with the shortest program running time.

Step 2: Cohort validation

A separate program extracted short strings of text that included 100 characters before and 50 characters after each keyword of interest. When clinical notes contained relevant keywords (with or without an ICD-9 code), transgender status was confirmed through an examination of the deidentified text strings by two trained reviewers. Disagreements among reviewers were adjudicated by a review committee that included two physician investigators (MG and VT) and the project manager (RN).
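The text-string extraction described above can be illustrated with a minimal Python sketch. This is not the study's SAS program; the function name `extract_snippets` and the case-insensitive literal matching are assumptions made for illustration, and only the window sizes (100 characters before, 50 after) come from the text.

```python
import re

def extract_snippets(note, keywords, before=100, after=50):
    """Return (keyword, snippet) pairs with context around each keyword hit.

    Hypothetical helper: window sizes follow the text, everything else is assumed.
    """
    snippets = []
    for kw in keywords:
        # Literal, case-insensitive match; re.escape protects regex metacharacters.
        for m in re.finditer(re.escape(kw), note, flags=re.IGNORECASE):
            start = max(0, m.start() - before)   # clip at the start of the note
            end = min(len(note), m.end() + after)  # clip at the end of the note
            snippets.append((kw, note[start:end]))
    return snippets
```

Each snippet would then go to the reviewers in deidentified form; deidentification itself is outside the scope of this sketch.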

Cohort candidates with no keywords but at least two different diagnostic codes or the same code on different dates were considered eligible. The validity of this approach was confirmed using unstructured chart review during pilot testing of the study protocol, as described previously.24
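The code-only eligibility rule can be written as a short predicate. The sketch below is a hedged illustration in Python, assuming each candidate's diagnoses arrive as (code, date) pairs; the function name `eligible_by_codes` is hypothetical and the study's own implementation is not shown in this paper.

```python
from datetime import date

def eligible_by_codes(records):
    """Apply the code-only rule to a candidate with no keyword hits.

    records: iterable of (icd9_code, date) pairs (assumed input shape).
    Eligible if there are at least two different codes, or the same code
    recorded on at least two different dates.
    """
    pairs = set(records)                  # drop exact duplicate rows
    codes = {code for code, _ in pairs}
    if len(codes) >= 2:
        return True
    dates = {d for _, d in pairs}
    return len(codes) == 1 and len(dates) >= 2
```

For example, a single code recorded twice on the same day would not qualify, whereas the same code on two visits would.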

Members who had evidence of disorders of sex development (ie, abnormalities of chromosomal, gonadal or anatomic sex42) and those younger than 3 years of age at the index date were excluded.

Step 3: Determination of TF/TM status

Each eligible study participant was categorised as TF or TM using several methods. We used all keyword text strings and ICD-9 codes extracted for step 1 to identify additional words such as ‘male-to-female’, ‘female-to-male’ and gender affirmation V codes (V07.8+12124952 and V07.8+12124310). During the validation of transgender status, the reviewers were also instructed to categorise each eligible person as ‘natal male’, ‘natal female’ or ‘unclear’.

For persons whose TF/TM status was unclear after the initial review and for persons with ICD-9 codes only, another free-text program was developed to search for keywords reflecting natal sex anatomy (eg, ‘testes’ or ‘ovaries’), history of specific procedures (eg, orchiectomy or hysterectomy) or evidence of hormonal therapy (eg, oestrogen or testosterone). The keywords used for assigning TM/TF status are included in table 2. Text strings containing TF-specific and TM-specific keywords were reviewed and adjudicated as discussed above.

Table 2

Keywords used for STRONG transgender cohort natal sex assignment

Gender affirmation treatment status

During the initial STRONG cohort validation (step 2) and natal sex determination (step 3), reviewers were instructed to check a box for ‘Evidence of treatment’ if the text strings provided an indication of receipt or referral for HT, surgery or other relevant procedures (eg, electrolysis). Disagreements were adjudicated as described previously.

In addition to text string reviews, gender affirmation treatment status was determined by linkage with cross-sex hormone prescriptions using national drug codes, as well as ICD-9, ICD-10 and Current Procedure Terminology (CPT) codes reflecting surgeries and other interventions (online supplementary tables 1–6). TF drugs (eg, oestradiol, spironolactone) in a natal male and TM drugs (eg, testosterone) in a natal female were considered as evidence of HT.
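The rule for counting a prescription as evidence of HT can be sketched as follows. This is a simplified Python illustration that matches on drug names; the study linked prescriptions through national drug codes, and the drug sets below contain only the examples named in the text, not the full supplementary lists. The function name `has_ht_evidence` is hypothetical.

```python
# Example drugs named in the text only; the full lists are in the online supplement.
TF_DRUGS = {"oestradiol", "spironolactone"}
TM_DRUGS = {"testosterone"}

def has_ht_evidence(natal_sex, drug_names):
    """Return True if prescriptions are consistent with cross-sex HT.

    natal_sex: "male" or "female"; any other value (eg, undetermined) gives False.
    """
    drugs = {d.lower() for d in drug_names}
    if natal_sex == "male":
        return bool(drugs & TF_DRUGS)   # feminising drugs in a natal male
    if natal_sex == "female":
        return bool(drugs & TM_DRUGS)   # masculinising drugs in a natal female
    return False
```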

Supplementary file 1

As a result of these steps, cohort members were categorised based on evidence of HT and surgical or other gender affirmation procedures. Gender affirmation procedures were categorised as: bottom (eg, vaginoplasty for TF or vaginectomy for TM); top (eg, breast augmentation for TF or mastectomy for TM); interventions to change secondary sex characteristics (eg, electrolysis) or not specified (ie, evidence of surgery in the text only).

Selection of the reference cohort

Up to 10 male and 10 female KP enrollees without evidence of transgender status were matched to each member of the final validated transgender cohort on year of birth (within 5-year groups for adults and 2-year groups for children and adolescents), race/ethnicity, KP site and membership year at the index date. Index date was defined as the date of the first recorded evidence of transgender status in the EMR. For some transgender cohort members, the number of reference males or females was fewer than 10 due to duplicate matches; however, no transgender participant had fewer than seven referents of either sex. A cluster ID for each matched group was assigned to allow stratified analyses (eg, by HT type or gender-affirming surgery).
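The matching logic can be illustrated with a minimal Python sketch. Several details are assumptions rather than the study's documented procedure: the record fields (`id`, `sex`, `birth_year`, `race_eth`, `site`, `member_years`), the age cut-off of 18 years at the index year for choosing the birth-year band, random selection among eligible referents, and matching without replacement via a shared `used` set.

```python
import random

def match_referents(case, pool, used, per_sex=10, rng=None):
    """Pick up to per_sex male and per_sex female referents for one case.

    case/pool entries are plain dicts (hypothetical field names).
    used: set of referent ids already matched; updated in place so that
    no referent is assigned twice (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    # 5-year birth bands for adults, 2-year bands for minors (cut-off assumed).
    width = 5 if case["index_year"] - case["birth_year"] >= 18 else 2
    band = case["birth_year"] // width
    matched = {}
    for sex in ("male", "female"):
        candidates = [
            p for p in pool
            if p["id"] not in used
            and p["sex"] == sex
            and p["race_eth"] == case["race_eth"]
            and p["site"] == case["site"]
            and case["index_year"] in p["member_years"]
            and p["birth_year"] // width == band
        ]
        chosen = rng.sample(candidates, min(per_sex, len(candidates)))
        used.update(p["id"] for p in chosen)
        matched[sex] = [p["id"] for p in chosen]
    return matched
```

Sharing one `used` set across cases is what can leave a later case with fewer than 10 referents of a given sex.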

Data integration and follow-up

Patient identification numbers for both the transgender and reference cohorts were linked to multiple data sources to obtain ICD-9 and ICD-10 diagnostic codes for the conditions of primary interest (eg, mental health conditions, cardiovascular diseases and diabetes); disease registries to ascertain incident cancers and HIV diagnoses; psychiatric and behavioural healthcare utilisation; pharmacy records to track mental health treatment and HT receipt over time; and dates and results of laboratory tests including bone scans, blood chemistry analyses, hormone levels and blood cell counts (box). Mortality was ascertained from linkages to death registries. All members of the cohort were assigned a study ID by the programmer at each KP site and no personally identifiable information was included in the aggregated analytic file. To date, cohort follow-up extends through the end of 2016.


Data available for Study of Transition, Outcomes and Gender transgender cohort

Data categories and specific elements

Demographic and membership characteristics

  • Age, sex and race/ethnicity

  • Health plan site

  • Area-based socioeconomic status factors*

  • Enrolment/disenrolment intervals

  • Insurance plan type

General health indicators

  • Height/weight (body mass index)*

  • Smoking status*

  • Comorbidities

Gender affirmation procedures

  • Current Procedure Terminology and/or International Classification of Diseases code

  • Date of procedure

  • History of gender affirmation procedures (clinical notes)

Pharmacy records (hormone therapy, psychiatric medications)

  • Medication prescribed

  • Filled prescription for medication

  • Dose

  • Form

  • Dates of prescription and fill

Visit-associated diagnoses

  • Cardiovascular disease 

  • Diabetes

  • HIV

  • Mental health problems

Cancer diagnoses

  • Stage

  • Site

  • Histology

  • Date of diagnosis

Laboratory results

  • Laboratory test

  • Value

  • Date

Vital status

  • Date of death

  • Cause of death

*Assessed at index date (date of first evidence of transgender status in electronic medical record).

Findings to date

Initial review of the EMRs identified 12 457 potential transgender individuals. Of these, 7272 (58%) were identified through keywords only, 1132 (9%) through ICD-9 codes only and the remaining 4053 (33%) had both ICD-9 codes and keywords (figure 2). Among these candidates, 6456 were confirmed as transgender: 10% from ICD-9 codes alone, 29% from keywords alone and 61% from both codes and keywords. Based on validation results, the positive predictive values for keywords, diagnostic codes and both were 26%, 54% and 98%, respectively. The leading reason for non-eligibility was the use of a keyword (eg, transgender) referring not to the patient, but to the patient’s relative or partner. In other situations, the keywords of interest were used as part of standard text, such as when listing indications for hormone use. Natal sex and/or gender identity was successfully determined for all but 89 (1.4%) of the transgender cohort members.

Figure 2

Results of STRONG transgender cohort ascertainment and validation. ICD-9, International Classification of Diseases, Ninth edition; STRONG, Study of Transition, Outcomes and Gender; TF, transfeminine; TM, transmasculine.

The transgender cohort was matched to 127 608 enrollees with no evidence of transgender status. Of those, 63 825 were women and 63 783 were men.

Figure 3 displays proportions of transgender enrollees over time at each of the three participating sites. In 2006, the prevalence estimates (95% CIs) per 100 000 enrollees were 3.5 (1.9 to 6.3), 5.5 (4.8 to 6.4) and 17 (16 to 19) in KPGA, KPSC and KPNC, respectively. By 2014, the corresponding estimates increased to 38 (32 to 45) in KPGA, 44 (42 to 46) in KPSC and 75 (72 to 78) in KPNC. The composition of the transgender population has also changed. Whereas in 2006, the TF:TM ratio among newly identified cohort members was approximately 1.7:1, in 2014 the same ratio was 1:1.
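A prevalence estimate per 100 000 enrollees with a 95% CI can be computed along the following lines. The paper does not state which interval method was used, so the Wilson score interval in this Python sketch is an assumption, as is the function name `prevalence_per_100k`; the site-level numerators and denominators are not reported here, so no attempt is made to reproduce the published figures.

```python
import math

def prevalence_per_100k(cases, enrollees, z=1.96):
    """Point estimate and Wilson score interval, scaled per 100 000 enrollees.

    Returns (estimate, lower, upper). The Wilson method is an assumed choice.
    """
    p = cases / enrollees
    denom = 1 + z**2 / enrollees
    centre = (p + z**2 / (2 * enrollees)) / denom
    half = z * math.sqrt(p * (1 - p) / enrollees + z**2 / (4 * enrollees**2)) / denom
    scale = 100_000
    return p * scale, (centre - half) * scale, (centre + half) * scale
```

The Wilson interval is asymmetric around the point estimate when counts are small, which suits rare outcomes better than a simple normal approximation.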

Figure 3

Prevalence of transgender status by site and year of health plan enrolment. Dotted lines represent linear trends. KPGA, Kaiser Permanente health plan located in Georgia; KPNC, Kaiser Permanente health plan located in Northern California; KPSC, Kaiser Permanente health plan located in Southern California.

As shown in table 3, about 60% of all participants were from KPNC, 38% were from KPSC and less than 3% were from KPGA. With respect to race and ethnicity, blacks and Asians each comprised about 8% of the study population, 19% were Hispanics and 55% were non-Hispanic whites. Compared with TF, TM subjects were younger (76% vs 53% under the age of 36) and included a greater proportion of subjects who were obese (31% vs 22%). Proportions of smokers, insurance status and area-based measures of education were similar in TM and TF study subjects.

Table 3

Characteristics of the STRONG transgender cohort

Nearly two-thirds of the transgender cohort had some evidence of gender-affirming treatment (table 4). Approximately 55% of all transgender cohort members had evidence of HT received at KP. This proportion was slightly higher (58%) in TF than in TM participants (52%).

Table 4

Gender affirmation status of the STRONG transgender cohort members

About 23% of the transgender cohort had some evidence of gender affirmation surgery. Top surgery receipt at KP was far more common among TM cohort members than among their TF counterparts (12% vs 0.3%). Similar proportions of TM and TF cohorts had genital surgeries (4%–5%) or procedures aimed at altering other secondary sex characteristics (11%).

Tables 5 and 6 present the number of cases for various health outcomes in the STRONG population through the end of 2016. These frequencies should not be interpreted as evidence of increased or decreased risk because they do not account for person-time of follow-up, time ordering of the conditions and transgender status and do not take into consideration exposures to cross-sex hormones or surgical procedures. Nevertheless, these data indicate that both TF and TM cohort members, as well as their corresponding referents, have sufficient numbers of cardiovascular events, mental health conditions, HIV, diabetes and several common cancers to permit meaningful analyses. Additional analyses will include changes in laboratory-based endpoints including polycythemia and bone density. These analyses are beyond the scope of this communication, which is focused on the study methods rather than specific findings. The planned analyses will include comparisons of transgender and reference cohorts as well as within-transgender cohort examination of health status before and after surgery and initiation of HT.

Table 5

Frequency of health outcomes in the STRONG TF cohort relative to matched comparison groups

Table 6

Frequency of health outcomes in the STRONG TM cohort relative to comparison groups

Strengths and limitations

In this communication, we describe STRONG, a health system-based observational study that was designed to examine the health status of transgender people and to evaluate the possible risks and health benefits of various gender-affirming treatments. STRONG aimed to overcome five previously described methodological challenges facing transgender health research.

Sample size and power considerations

Adequate sample size can be feasibly achieved with the use of large well-defined populations that offer an adequate sampling frame.38 In practical terms, at least in the USA, this can be done by basing the study in large integrated health systems with millions of members and comprehensive EMRs. The EMR data from the health systems allow assembling cohorts of hard-to-reach populations and ample options for selection of referent groups. The STRONG cohort, which included almost 6500 transgender people and nearly 130 000 referents, represents one of the largest studies of its kind available to date. Nevertheless, a number of important analyses by different subtypes of gender affirmation treatment may not be feasible due to sparse stratum-specific data.

Systematic identification of eligible study participants

We demonstrated that by using a relatively simple algorithm—based on standard codes and supplemented with analysis of digitised provider notes—it is possible to comprehensively identify transgender enrollees of large community-based health plans. The use of keyword-containing text strings enhanced cohort ascertainment relative to ICD code-alone-based approaches. On the other hand, reliance on keywords without text validation would have erroneously included a substantial number of persons who are not transgender. A review of records to confirm transgender status added considerable time and resources; however, it is still more efficient and more comprehensive than the traditional unstructured chart review. In conducting cohort validation, we reviewed up to three clinical note excerpts on 11 325 people. This task was accomplished within 6 months.

A comprehensive identification of all transgender people in the KP population (with and without evidence in the medical records) would require contacting more than 8 million members to inquire about their natal sex and gender identity; this does not appear feasible at this time.

Determination of natal sex and/or gender identity

Our study definitively ascertained natal sex and/or gender identity for nearly 99% of cohort members. We obtained information on TM/TF status from three sources: keyword text strings, pharmacy records and procedure codes. In most instances, these sources were in agreement; however, in some cases the results were discordant. Each disagreement was used as an opportunity to check data accuracy and helped reduce misclassification, which would have been substantial if STRONG had relied on demographic data in the EMR. For example, among adult TF study subjects, 41% were documented as ‘female’ and 59% were documented as ‘male’. The corresponding proportions of people classified as ‘female’ and ‘male’ among adult TM cohort members were 59% and 40%, respectively, with 1% recorded as unknown in the EMR. By contrast, in 96% of persons under the age of 18, the demographic variable reflected natal sex.

A limitation of the current data is the inability to accurately identify persons who reject binary gender categories. These individuals are likely to be found among cohort members who do not have a transgender-specific diagnosis and receive no HT or surgical treatment; however, at the present time EMRs alone are not sufficient for determination of non-binary gender identity. This will be possible in the near future after KP introduces new data capture systems with separate fields for natal sex, gender identity, preferred pronouns, organ inventory and history of gender-affirming procedures.

Assessment of current and past gender affirmation treatment

Although the information on gender affirmation received within the KP system is of high quality, one of the main limitations of STRONG data is the lack of information on HT and surgical treatment received outside the KP system. This restricts our ability to identify a subcategory of transgender cohort members with no history of gender affirmation treatment of any kind. For this reason, the most definitive analyses are limited to people who initiated therapy at KP. These individuals can be identified among those STRONG participants whose EMR demonstrates a gap between index date and the first prescription for HT. This ‘HT initiation’ group represents about 35% of TM and 32% of TF subjects.
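The ‘HT initiation’ rule above (a gap between index date and the first HT prescription) can be expressed as a simple date comparison. The sketch below is illustrative; the function name and, in particular, the `min_gap_days` threshold are assumptions, since the minimum gap is not specified here.

```python
from datetime import date, timedelta


def initiated_ht_in_system(index_date, first_ht_rx_date, min_gap_days=1):
    """Return True if the first HT prescription follows the index date by a gap.

    min_gap_days is a hypothetical threshold chosen for illustration.
    Members with no HT prescription on record (None) are not in the group.
    """
    if first_ht_rx_date is None:
        return False
    return first_ht_rx_date - index_date >= timedelta(days=min_gap_days)
```

A member with an index date of 1 January 2010 and a first HT prescription on 1 June 2010 would be classified as initiating HT within the system, whereas a member whose first prescription falls on the index date would not.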

The broadening of coverage for gender affirmation services at KP occurred relatively recently. As the proportion of transgender people among enrollees has been increasing and many patients now initiate and receive gender affirmation therapy exclusively within the system, it is important to both expand the cohort and extend the follow-up of current participants.

Engagement of patient and physician stakeholders

A critical feature of STRONG is patient-centredness. During the study implementation, we held monthly stakeholder calls and three in-person Stakeholder Advisory Group meetings. These interactions had a direct impact on study design and implementation. For example, the list of keywords used for STRONG cohort ascertainment was proposed, pilot-tested and refined in close consultation with the study stakeholders. Following advice from stakeholders, we expanded eligibility criteria to include transgender and gender non-conforming youth (persons under 18 years of age). This change allowed a number of additional analyses and offers important opportunities for future follow-up. STRONG stakeholders also helped to develop a comprehensive list of hormonal medications and procedures used for gender affirmation. The publications describing our formative research,24 43–45 the current communication and the reports in preparation all include stakeholders as coauthors.


Although the body of literature addressing transgender health issues has been growing,46 most studies focus on substance use, sexual health, sexually transmitted infections and, to a lesser extent, mental health conditions.47 By contrast, limited data are available on general health status, or the incidence of chronic age-related conditions including cardiovascular disease, endocrine disorders and cancer.

To date, most data on morbidity and mortality in transgender populations come from clinical centres in Europe.19 48–50 These studies are characterised by detailed clinical data; however, they are limited by relatively small sample sizes.

In terms of overall design and size, STRONG is comparable to US-based studies that used the Veterans Health Administration (VHA) data.22 51 Unlike our study, however, the VHA data did not distinguish between TF and TM subjects and were limited to transgender persons identified via ICD-9 codes, without keywords.

We recognise that enrolling transgender people through an integrated healthcare system yields a cohort restricted to persons with health insurance. Weighing against this concern is the demonstrated ability to cost-effectively identify a large cohort of transgender subjects and referents with a high degree of internal validity. The availability of a well-defined underlying population with detailed EMRs ensures that participation does not require subject opt-in and allows selecting a complete cohort (rather than a sample) of eligible subjects. Moreover, as KP now provides ‘one-stop’ delivery of transgender care, the likelihood of capturing full details of gender affirmation treatment is increased. These advantages in internal validity weigh heavily against concerns about representativeness.

In summary, STRONG is well positioned to fill existing knowledge gaps and make important contributions to the current literature. Lessons learnt while conducting this project can support future transgender health-related research. The methodology can be implemented with little site-specific customisation at other healthcare institutions with EMRs, particularly organisations participating in the Healthcare Systems Research Network (total population of almost 20 million). With extended follow-up and expanded cohort size, the data will permit additional analyses of rare health endpoints across various categories of surgical procedures, and different HT formulations, routes of administration and doses.


The authors acknowledge Kimberly L Cannavale, MPH, and Alexander S Carruth, who performed additional validations of eligibility and ascertainment of natal sex via reviews of medical records.




  • Contributors VPQ and MG prepared the original draft of the manuscript. RN conducted data analyses and put together tables and figures. RC and LC were responsible for the preparation and application of data collection programs and ascertainment of study variables. EH, VPQ, DG, MJS and DT led study implementation at participating sites and were actively involved in study planning and design. TAB-C, AM and BR were responsible for the day-to-day project management at each site. DR and TLL provided methodological input on various aspects of study design, including identification of sources of bias and ways of addressing threats to validity. VT and JS provided input on clinical aspects of the study and were responsible for adjudication of study eligibility, gender identity and gender affirmation. KW developed, pilot-tested and supervised the cohort validation and gender identity ascertainment review protocols and programming. SG, CV and SW are members of the Stakeholder Advisory Group who assisted with the development of inclusion and exclusion criteria, assessment of gender identity and interpretation of the data. All authors provided critical review of the manuscript for important intellectual content and approved the final version.

  • Funding This research was supported by Contract AD-12-11-4532 from the Patient-Centered Outcomes Research Institute and by Grant R21HD076387 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development.

  • Competing interests None declared.

  • Ethics approval Emory University IRB.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Once the initial data analyses are complete, we will be open to collaborations with outside investigators as permitted by the IRBs of participating sites as well as by local, state and federal laws and regulations. In particular, we will encourage collaborations with researchers whose expertise is under-represented on our research team. To become a collaborator, a researcher will be required to submit an application, which will undergo both a scientific and an IRB review. In view of the complexity of the database, interested investigators will be asked to form a collaborative arrangement with the STRONG investigators rather than simply receive the data themselves. No additional data are available.