Introduction Sarcoidosis is a multiorgan granulomatous disorder thought to be triggered and influenced by gene–environment interactions. Sarcoidosis affects 45–300/100 000 individuals in the USA and has an increasing mortality rate. The greatest gap in knowledge about sarcoidosis pathobiology is a lack of understanding about the underlying immunological mechanisms driving progressive pulmonary disease. The objective of this study is to define the lung-specific and blood-specific longitudinal changes in the adaptive immune response and their relationship to progressive and non-progressive pulmonary outcomes in patients with recently diagnosed sarcoidosis.
Methods and analysis The BRonchoscopy at Initial sarcoidosis diagnosis Targeting longitudinal Endpoints study is a US-based, NIH-sponsored longitudinal blood and bronchoscopy study. Enrolment will occur across four centres with a target sample size of 80 eligible participants within 18 months of tissue diagnosis. Participants will undergo six study visits over 18 months. In addition to serial measurement of lung function, symptom surveys and chest X-rays, participants will undergo collection of blood and two bronchoscopies with bronchoalveolar lavage separated by 6 months. Freshly processed samples will be stained and flow-sorted for isolation of CD4+ T helper (Th1, Th17.0 and Th17.1) and T regulatory cell immune populations, followed by next-generation RNA sequencing. We will construct bioinformatic tools using these gene expression data to define sarcoidosis endotypes that associate with progressive and non-progressive pulmonary disease outcomes and validate the tools using an independent cohort.
Ethics and dissemination The study protocol has been approved by the Institutional Review Boards at National Jewish Hospital (IRB# HS-3118), University of Iowa (IRB# 201801750), Johns Hopkins University (IRB# 00149513) and University of California, San Francisco (IRB# 17-23432). All participants will be required to provide written informed consent. Findings will be disseminated via journal publications, scientific conferences, patient advocacy group online content and social media platforms.
- interstitial lung disease
- respiratory medicine (see thoracic medicine)
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the largest and most geographically diverse longitudinal bronchoscopy study with paired blood analysis performed to date in pulmonary sarcoidosis in the USA.
The study design will use repeated measures of a comprehensive clinical phenotyping strategy over 18 months to ascertain pulmonary outcomes in patients with recently diagnosed disease that will be a rich clinical and biological data resource for studies in pulmonary sarcoidosis.
The experimental design will leverage single cell analysis to define immunophenotypic changes in disease associated CD4 T cell populations over time and associate findings with progressive and non-progressive pulmonary sarcoidosis clinical phenotypes.
The experimental plan will also measure transcriptional profiles of isolated CD4 T cell populations using RNA sequencing to enable construction of novel computational prognostic tools via expression deconvolution and association of findings with progressive and non-progressive pulmonary sarcoidosis.
A potential methodological limitation is the reliance on flow cytometers at each centre for processing of freshly isolated samples requiring significant optimisation.
Sarcoidosis is a multisystem granulomatous disorder that is thought to be triggered and influenced by gene–environment interactions.1 This condition affects at least 45–300/100 000 individuals in the USA2 and has a rising mortality rate.3 Sarcoidosis affects the lungs in more than 90% of cases. The greatest gap in knowledge in sarcoidosis management is a lack of tools to discern which subjects will develop progressive disease vs stable or remitting disease. In addition, effective advanced therapeutics are lacking, which is linked to a poor understanding of the pathophysiology driving progressive disease.4 While the environmental triggers are unknown,5–8 the inflammatory response is characterised by predominant activation of CD4+ T cells, cytokine production, macrophage activation and granuloma formation.9–18
The immunological determinants of clinical outcomes in sarcoidosis remain poorly understood. Sarcoidosis is traditionally characterised by enhanced production of interferon-gamma (IFNγ) by CD4+ T cells. Importantly, recent findings identified that a majority of IFNγ-producing cells in the bronchoalveolar lavage (BAL) from sarcoidosis patients bear a Th17 phenotype and are more properly classified as Th17.1 cells rather than Th1 cells.19 In addition, a robust expansion of IFNγ-producing Th17.1 cells was identified in a European cohort of newly diagnosed sarcoidosis patients, along with an even more striking expansion of interleukin 17-producing Th17 cells in the peripheral blood.20 Mouse models have revealed functional ‘plasticity’ in Th17 cells, demonstrating their ability to ‘transdifferentiate’ to an anti-inflammatory phenotype as defined by their transcriptional profile and regulatory capacity.21 Other studies support that a deficit in regulatory T cell (Treg) capacity may permit disease activity in sarcoidosis.22–25 In other words, disease activity in sarcoidosis may depend, in part, on T cell functional plasticity, resulting from transdifferentiation of Th17 cells into pathogenic Th17.1 cells versus anti-inflammatory Tregs, as shown in vivo.21 As such, our study hypothesis is that Th17.1 cells and Tregs play significant yet opposing roles in determining clinical outcomes in sarcoidosis, and that Th17 cells could transcriptionally reflect a multifunctional (eg, Treg) profile in patients with stable (non-progressive or improving) disease.
The BRonchoscopy at Initial sarcoidosis diagnosis Targeting longitudinal Endpoints (BRITE) study is an NIH (National Institutes of Health)-sponsored, multicentre, longitudinal study that will measure changes in the immune response and its relationship to pulmonary outcomes in sarcoidosis patients early in their disease course. This is the first study to investigate longitudinal changes in paired blood and BAL T cell phenotype. The study has three primary aims. In aim 1, we will measure changes in T cell lineage diversity (Th17.1, Th17, Th1 and Treg) that occur longitudinally as sarcoidosis subjects develop either progressive or non-progressive disease. In aim 2, we will define changes in T cell lineages during disease course through identification of genome-wide transcriptional profiles of purified Th17.1, Th17, Th1 and Treg cells using low-input RNA sequencing (RNA-seq). In aim 3, we will construct computational tools from T cell lineage diversity and gene expression data to define and/or predict sarcoidosis endotypes using bioinformatic deconvolution and biostatistical methods. We will then extend the genomic findings from the NIH GRADS study (Genomic Research in Alpha-1 anti-trypsin Deficiency and Sarcoidosis)26 by comparing our genomic T cell signatures with gene expression data from this independent cohort to predict T cell lineages and relate findings to outcome definitions used in that study. These bioinformatic tools can also be applied to other genomic datasets from US and European cohorts.
This study fills a major gap in our current knowledge by focusing on the relationship between longitudinal measures of pulmonary outcomes with T cell lineages using an integrated approach of phenotypic cell-sorting, followed by RNA transcriptional profiling, in addition to the measurement of T cell phenotypes in the blood and BAL compartments. Gene expression data obtained in this study will be deconvoluted to create a pipeline of computational tools that can be applied to other datasets to predict disease phenotypes and clinical course. These tools can also be used to assess clinical responses to emerging immunomodulatory therapies and to identify cellular signalling pathways that could be targeted with existing small molecule inhibitors or biological therapies.
Methods and analysis
This study involves four academic universities with established clinical centres in sarcoidosis care and research: National Jewish Health (NJH), University of Iowa, University of California, San Francisco (UCSF) and Johns Hopkins University. Together, the centres span the USA from coast to coast. Each site is responsible for enrolling a similar number of participants (figure 1).
Participants will have been diagnosed within 18 months of enrolment according to criteria endorsed by the American Thoracic Society.27 Additional eligibility criteria for enrolment are presented in box 1. Participants with fibrotic (chest X-ray, CXR stage IV) lung disease will be excluded because this form of lung disease is considered long-standing.
Box 1 Inclusion and exclusion criteria
Inclusion criteria
A histopathological diagnosis of sarcoidosis according to the American Thoracic Society/European Respiratory Society sarcoidosis statement, with the exception of participants with Lofgren’s syndrome, who are exempt from a pathological diagnosis.
Diagnosis of sarcoidosis within 18 months of enrolment.
Non-smokers (<10 pack-years and no smoking, including vaping and marijuana use, for 6 months prior to enrolment).
No history of systemic immunosuppression within 2 weeks of enrolment.
Chest X-ray Scadding stage 0, I, II or III (chest X-ray stage 0 requires lung biopsy confirming granulomatous inflammation).
Exclusion criteria
Unable to tolerate study procedures as determined by the site principal investigator.
Systemic immunosuppressive therapy within 2 weeks of enrolment except for inhaled steroids.
Scadding stage IV.
Evidence of active cardiac, neurological or ophthalmic involvement with sarcoidosis requiring systemic immunosuppressive therapy.
Diagnosis of beryllium sensitisation and/or disease, common variable immunodeficiency, mycobacterial and/or fungal infection or suspected hypersensitivity pneumonitis.
On anticoagulation except for aspirin.
Active bacterial or viral infection, use of antibiotics or immunisation within 4 weeks of enrolment (may reassess for participation after the 4-week period).
Known medical problems that could affect biological interpretations, including malignancy, autoimmunity, asthma, chronic obstructive pulmonary disease (COPD) and chronic viral infections (hepatitis B or C, herpes virus requiring suppressive medications, or human immunodeficiency virus (HIV)).
History of cancer other than presumed cured non-metastatic skin cancer.
Currently institutionalised (eg, prison, long-term care facility).
Other comorbid conditions that increase risk of complications from bronchoscopy, including uncontrolled hypertension and/or diabetes, unstable coronary artery disease, decompensated heart failure or active cardiac arrhythmias.
Study participants will undergo several procedures as depicted in table 1. A history and physical examination will be performed at each study visit to assess disease status and changes in measurements and to confirm safety of performing bronchoscopy with lavage. An anterior–posterior CXR and pulmonary function testing including forced vital capacity (FVC), forced expiratory volume in 1 s (FEV1) and a single breath diffusing capacity for carbon monoxide (DLCO) will be performed at baseline and at 6-month intervals during follow-up. Standard bronchoscopy with lavage will be performed at baseline and 6 months later. At each 6-month study visit, participants will complete a blood draw and three patient-reported outcome measures to numerically quantify dyspnoea (University of California, San Diego Shortness of Breath Questionnaire28), quality of life (12-item Short Form Survey (SF-12)29) and fatigue (Patient-Reported Outcomes Measurement Information System30).
The primary outcome at the completion of the study visits will be progressive pulmonary sarcoidosis. To define this outcome, we will use a composite measure of lung function and radiology measurements. For lung function, we will define progression by declines in lung function using thresholds that have precedent in the interstitial lung disease literature.31 32 Specifically, participants will meet the definition of progressive lung disease if there is a ≥10% decline in FVC and/or FEV1 or a ≥15% decline in DLCO. For chest imaging, we will define progression by increasing opacities on chest radiography as determined by the interpreting radiologist/investigator. If a participant meets either the lung function definition or the CXR definition, they will be categorised as having progressive lung disease. Participants who do not meet these definitions will be categorised as having non-progressive disease. We will exclude other causes that may mimic progressive disease based on clinical presentation, radiographic pattern consistent with sarcoidosis and analysis of BAL cell counts and differentials. BAL cultures will be performed based on clinical indications.
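The composite definition above can be expressed as a small decision rule. The following is a minimal sketch, not the study's analysis code: the names `LungFunction`, `percent_decline` and `is_progressive` are hypothetical, and it assumes that each decline is measured relative to the participant's own baseline value, which the protocol does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class LungFunction:
    """Lung function at one visit (units only need to match across visits)."""
    fvc: float
    fev1: float
    dlco: float


def percent_decline(baseline: float, follow_up: float) -> float:
    """Decline relative to baseline, in percent (positive means worsening).

    Assumption: relative decline from the baseline value; the protocol
    does not specify relative vs absolute change in percent predicted.
    """
    return 100.0 * (baseline - follow_up) / baseline


def is_progressive(baseline: LungFunction, follow_up: LungFunction,
                   cxr_opacities_increased: bool) -> bool:
    """Composite outcome: lung function criterion OR chest X-ray criterion."""
    lung_function_criterion = (
        percent_decline(baseline.fvc, follow_up.fvc) >= 10.0
        or percent_decline(baseline.fev1, follow_up.fev1) >= 10.0
        or percent_decline(baseline.dlco, follow_up.dlco) >= 15.0
    )
    return lung_function_criterion or cxr_opacities_increased
```

Because the two criteria are combined with a logical OR, a participant with stable lung function is still classified as progressive when the interpreting radiologist/investigator reports increasing opacities.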
Study visits and timeline are depicted in figure 2. Procedures at each visit are outlined in table 1. At enrolment (V1) and at 6-month intervals (V3, V5, V6) during the 18-month follow-up period, participants will undergo lung function testing and CXR and will complete questionnaires. The V2 and V4 visits focus on biospecimen collection by bronchoscopy with lavage and venipuncture and will occur within 1 month of V1 and V3, respectively (figure 2). The second bronchoscopy will occur approximately 5–6 months after the first bronchoscopy. Participants will also undergo venipuncture for blood collection at V5 and V6.
Our primary aim is to determine the distribution of T helper cells and Treg cells in paired blood and BAL samples over time and relate these measurements to clinical outcome (progressive vs non-progressive disease). Based on previous cohort studies from our group,33 approximately 50% of sarcoidosis patients met the definition for progressive disease. Therefore, we assumed that 50% of the target enrolment sample (ie, 40 participants out of a total target enrolment of 80 participants) will be progressive. With alpha=0.00625 (0.05/8, a Bonferroni correction for four cell subsets at two time points), we will have >80% power to detect a 0.82 SD difference in a cell subset percentage between groups; if only 40% progress, we will have >80% power to detect a 0.85 SD difference between groups. For the gene expression analyses, we will have 80% power to detect a 0.95 SD change in expression between progressive and non-progressive participants or an R² of 0.19 between expression and a quantitative clinical phenotype such as FVC, assuming alpha=0.001.
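The power statements above can be sanity-checked with a simple calculation. The sketch below uses a normal approximation to a two-sided two-sample comparison; this is an assumption made for illustration, since the protocol does not state how power was computed, and an exact t-test calculation gives slightly lower values. The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_power(effect_sd: float, n_group1: int, n_group2: int,
                 alpha: float) -> float:
    """Approximate power of a two-sided two-sample comparison.

    Normal approximation (an assumption, not the protocol's stated method).
    effect_sd is the between-group difference in SD units.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    noncentrality = effect_sd * sqrt(n_group1 * n_group2 / (n_group1 + n_group2))
    # The contribution of the opposite tail is negligible here and is ignored.
    return STD_NORMAL.cdf(noncentrality - z_crit)


# Bonferroni-corrected alpha: four cell subsets at two time points.
bonferroni_alpha = 0.05 / 8

# 50% progressive (40 vs 40) and 40% progressive (32 vs 48) out of 80.
power_50_percent = approx_power(0.82, 40, 40, bonferroni_alpha)
power_40_percent = approx_power(0.85, 32, 48, bonferroni_alpha)

# Gene expression analyses at alpha = 0.001.
power_expression = approx_power(0.95, 40, 40, 0.001)
```

Under this approximation, all three scenarios come out slightly above 0.80, consistent with the stated targets.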
Data collection methods
Deidentified clinical data will be entered and stored in a central REDCap (Research Electronic Data Capture) database (https://www.project-redcap.org) that was created for the BRITE study and hosted by UCSF. Data instruments include demographic information, organ involvement using a modified organ assessment tool,34 medical history, medications, questionnaire responses, pulmonary function measurements, complete blood counts and details related to the biospecimen collection of blood and BAL. The organ assessment tool was developed by Delphi study methodology34 and is a well-accepted survey to quantify organ involvement by sarcoidosis.26 The questionnaires will be completed by participants on paper or through a REDCap survey. The dyspnoea and fatigue questionnaires have been validated in patients with interstitial lung disease28 35 36 and sarcoidosis,30 37 respectively. The SF-12 instrument is an extensively used quality of life assessment tool that is scored into a Mental Component Summary score and a Physical Component Summary score.29 Bronchoscopy with lavage will be performed in two subsegments using up to a total of 480 mL sterile saline. Prebronchoscopy and postbronchoscopy assessment and monitoring will be performed as per each institution’s standard of care. Each centre will follow recommended ATS/ERS guidelines for test performance for spirometry38 and DLCO.39
A complete blood count will be collected at visits 2, 4, 5 and 6 (figure 2). Serum will be collected at visits 2, 4 and 6. Peripheral blood mononuclear cells (PBMCs) will be collected at visits 2 and 4 and processed for flow cytometric and gene expression assays. PBMCs will also be collected at visits 5 and 6 and frozen. Excess PBMCs at visits 2 and 4 will also be frozen. Whole blood will be collected into DNA PAXgene tubes and frozen at visit 5 and will be collected into RNA PAXgene tubes and frozen at visits 2 and 6. All frozen samples will be stored for future use as specified in the IRB approvals at each site. Cells isolated from PBMC and BAL samples will undergo multiparameter flow cytometry with sorting of Treg, Th1, Th17.0 and Th17.1 populations. Sorted cells will be lysed with appropriate RNase-inactivating buffers and kept frozen at −80°C until processed. Vials of lysed cells will be shipped to NJH for RNA isolation and next-generation RNA sequencing of each sorted population.
The data analyst will perform data validation in accordance with the study-wide protocol specifications. Quarterly data check programmes will be run to identify discrepancies in entered data. Study sites will be notified about data discrepancies, queries will be resolved and the database will be updated as appropriate. Discrepancies to be flagged will include inconsistent data, missing data, out-of-range values and deviations from the protocol. There will be a final data validation check of the study-wide dataset and the database will be locked after approval from all investigators.
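A data check programme of this kind could look roughly like the sketch below. It is illustrative only: the field names, the plausibility ranges and the `find_discrepancies` helper are hypothetical and are not taken from the BRITE REDCap database or from the REDCap API.

```python
# Hypothetical field names and plausibility ranges, for illustration only.
RANGES = {
    "fvc_l": (0.5, 8.0),
    "fev1_l": (0.3, 7.0),
    "dlco_ml_min_mmhg": (2.0, 50.0),
}


def find_discrepancies(record: dict) -> list[str]:
    """Flag missing, out-of-range and inconsistent values in one record."""
    issues = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None:
            issues.append(f"{field}: missing")
        elif not low <= value <= high:
            issues.append(f"{field}: out of range ({value})")
    # Illustrative consistency check: FEV1 should not exceed FVC.
    fvc, fev1 = record.get("fvc_l"), record.get("fev1_l")
    if fvc is not None and fev1 is not None and fev1 > fvc:
        issues.append("fev1_l exceeds fvc_l (inconsistent)")
    return issues
```

Each flagged issue would then become a query sent back to the enrolling site for resolution before the database is updated.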
Our primary aim will be to determine the distribution of T helper cells and Treg cells at diagnosis and follow-up in paired blood and BAL samples and relate these measurements to clinical course (progressive vs non-progressive disease). The primary statistical analyses will use t-tests or rank sum tests, as appropriate, to compare the percentage of each subset (eg, Th17.1) between participants with progressive and non-progressive disease. The primary analysis will use the distribution at enrolment to predict progressive vs non-progressive disease at follow-up; secondary analyses will consider the change in subset frequency between enrolment and follow-up. Exploratory analyses will test for association between the cell subset percentages and measures of disease severity (FEV1, FEV1/FVC, DLCO).
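As an illustration of the rank sum comparison, the sketch below computes the Mann-Whitney U statistic for one cell subset and a two-sided p value from the normal approximation. It is a simplified version (no tie or continuity correction) written for clarity, the function name `rank_sum_test` is hypothetical, and the example percentages are invented, not study data. In practice a statistical package would be used.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(group_a: list[float],
                  group_b: list[float]) -> tuple[float, float]:
    """Mann-Whitney U for group_a vs group_b with a two-sided p value.

    Uses the normal approximation without tie or continuity correction,
    which is crude for small samples.
    """
    n_a, n_b = len(group_a), len(group_b)
    # U counts pairs where the group_a value exceeds the group_b value;
    # tied pairs count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in group_a for b in group_b)
    mean_u = n_a * n_b / 2.0
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, p_value


# Invented Th17.1 percentages for illustration only (not study data).
progressive = [32.1, 28.4, 35.0, 30.2, 27.9]
non_progressive = [18.3, 22.5, 20.1, 25.7, 19.4]
u_stat, p_val = rank_sum_test(progressive, non_progressive)
```

With eight such comparisons (four subsets, two time points), each p value would be judged against the Bonferroni-corrected threshold of 0.00625 rather than 0.05.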
Association between expression and clinical outcomes: To determine whether individual genes or combinations of genes are associated with clinical outcomes, a series of analyses will be computed using gene expression from BAL and PBMC samples. The main clinical outcomes are as defined in aim 1 (progressive vs non-progressive). To leverage the ability to infer more than cross-sectional correlation, primary analyses will test whether gene expression at enrolment predicts the primary clinical outcomes at follow-up. Secondary analyses will test whether significant gene expression changes between enrolment and follow-up identified above are associated with the clinical outcomes at follow-up. Primary inference regarding disease aetiology will be based on gene expression from BAL samples among all sarcoidosis cases. To identify potential gene expression markers of our primary outcome, PBMC will be used to identify genes whose expression is correlated with expression of genes in BAL that predict the clinical outcomes. Exploratory analyses will consider other time point combinations and solely use PBMC gene expression to predict clinical outcomes.
Differential expression between dichotomous groups (eg, progressive vs non-progressive): this will be identified via negative binomial regression using the software DESeq2,40 adjusting for age and sex. We will also use CIBERSORTx to deconvolve RNA-seq data, both to predict cell-type proportions41 (which we can validate against our flow analysis results) and to compute cell-type-specific gene expression for the cell types we can reliably separate from bulk. Association between continuous outcomes (eg, FVC) and read count will be tested via regression models with the clinical outcome as the dependent variable and read count as a predictor, also adjusting for age and sex. We will perform these tests using read counts for all genes at enrolment for the primary analyses; the secondary analyses will use change in normalised read counts from enrolment to follow-up for only those genes differentially expressed overall between enrolment and follow-up. After the progressive versus non-progressive disease differential expression analysis, we will use the significantly associated genes to perform hierarchical clustering to determine the ability of the combination of genes to distinguish between progressive and non-progressive cases. We will perform analyses similar to those described above to identify the pathways involved and the overlap among the significantly associated genes for all the clinical outcomes measured in this study.
For aim 3, concordance between known and predicted cell-type proportions or cell-type-specific expression values will be determined by Pearson correlation coefficient and root mean square error to measure linear fit and estimation bias, respectively.42 Empirical p values will be generated to test the null hypothesis that no cell types in the signature matrix are present in a given mixture using a Monte Carlo sampling strategy for randomised mixture samples.42 A similar strategy will be used to estimate p values for the cell-type-specific expression data.
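The two concordance measures capture different things, which a short sketch can make concrete. The helper names `pearson_r` and `rmse` are hypothetical and the proportions are invented for illustration; the code is not part of CIBERSORTx.

```python
from math import sqrt


def pearson_r(known: list[float], predicted: list[float]) -> float:
    """Pearson correlation coefficient (linear fit)."""
    n = len(known)
    mean_k = sum(known) / n
    mean_p = sum(predicted) / n
    cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(known, predicted))
    var_k = sum((k - mean_k) ** 2 for k in known)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_k * var_p)


def rmse(known: list[float], predicted: list[float]) -> float:
    """Root mean square error (sensitive to systematic estimation bias)."""
    squared = sum((k - p) ** 2 for k, p in zip(known, predicted))
    return sqrt(squared / len(known))


# Invented proportions: flow-derived (known) vs deconvolution (predicted),
# where every prediction is shifted upward by 0.05.
known_props = [0.10, 0.20, 0.30, 0.40]
predicted_props = [k + 0.05 for k in known_props]

r_value = pearson_r(known_props, predicted_props)
error = rmse(known_props, predicted_props)
```

In this example the correlation is essentially perfect while the error is about 0.05, showing why both measures are reported: a constant overestimate leaves the linear fit intact but is picked up by the root mean square error.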
Clinical outcomes: Similar to aim 2, we will first test for whether gene expression at specific genes or combinations of genes is associated with clinical outcomes. In this aim, we will be able to use the cell-type-specific expression data to test these associations. We will use the same approach in terms of primary analyses (expression in BAL at enrolment predicting clinical outcomes at follow-up) and secondary analyses (changes in BAL expression between enrolment and follow-up predicting clinical outcomes at follow-up). We anticipate choosing one or two cell-specific subset expression profiles rather than testing all subsets; we will use the results from aim 2 and aim 3 to determine the cell types that appear most likely to be driving any clinical associations. Exploratory analyses will evaluate all the cell subsets and PBMC expression. We will first test individual genes within each subset but will also consider using principal components analysis to reduce dimensionality in addition to other refinement of the list of genes to test based on our results in aim 2. If we test all genes, we will use a false discovery rate of ≤0.001, similar to aim 2. If we have a smaller subset of genes to test in primary analyses, we will use a Bonferroni correction to determine the type I error rate. The constructed computational tool and identified endotypes from the BRITE study cohort will be tested with an independent cohort of samples (ie, the GRADS datasets).
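The two multiple-testing strategies mentioned above differ in how the significance threshold is set. The sketch below assumes the Benjamini-Hochberg step-up procedure for false discovery rate control; the protocol does not name a specific procedure, so this choice and the function names are assumptions made for illustration.

```python
def benjamini_hochberg(p_values: list[float], fdr: float) -> list[bool]:
    """Flag which p values are significant at the given false discovery rate.

    Benjamini-Hochberg step-up procedure (assumed here; the protocol
    does not specify which FDR procedure will be used).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            max_rank = rank
    significant = [False] * m
    for index in order[:max_rank]:
        significant[index] = True
    return significant


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test type I error rate under a Bonferroni correction."""
    return alpha / n_tests
```

For example, `benjamini_hochberg([0.0001, 0.04, 0.0003, 0.5], fdr=0.001)` flags the first and third genes only, whereas a Bonferroni threshold simply divides the overall alpha by the number of genes tested and is therefore better suited to a small, prespecified gene list.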
Recruitment began in February 2020 with a projected completion of study visits by February 2023.
Patient and public involvement
It was not appropriate or possible to involve patients or the public in the design, or conduct, or reporting, or dissemination plans of our research.
Ethics and dissemination
This multicentre study is being conducted in accordance with globally accepted standards of good practice, in agreement with the Declaration of Helsinki and with local regulations. The study protocol has been approved by the Institutional Review Boards at National Jewish Hospital (IRB# HS-3118), University of Iowa (IRB# 201801750), Johns Hopkins University (IRB# 00149513), and University of California, San Francisco (IRB# 17-23432). All participants will be required to provide written informed consent. All manuscripts resulting from the study will be submitted to peer-reviewed journals. Findings will be disseminated via journal publications, scientific conferences, patient advocacy group online content and social media platforms.
Patient consent for publication
This work has not been previously presented at a conference or published as a conference abstract. We would like to acknowledge Kristyn MacPhail for her contributions to the study design.
Contributors All the authors contributed to the conception and design of the study. Drafting of the manuscript was performed by NYH and LLK. Rigorous critiques were provided by LDH, EKW, NKA, LP, BRW, REM, PG, BQB, LL, MG, SEC, JC, ESC, LAM, SML and BPO’C.
Funding This study is funded by the National Heart, Lung and Blood Institute (NHLBI) of the National Institutes of Health (NIH), grant number 5R01HL136681.
Competing interests The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Provenance and peer review Not commissioned; peer reviewed for ethical and funding approval prior to submission.