Harmonization of multi-site diffusion tensor imaging data
Introduction
Diffusion tensor imaging (DTI) is a well-established magnetic resonance imaging (MRI) technique for studying the white matter (WM) organization and tissue characteristics of the brain. DTI has been used extensively to study both brain development and pathology; see Alexander et al. (2007) for a review of DTI and several of its applications. In studies assessing white matter tissue characteristics, two commonly reported scalar maps are the mean diffusivity (MD), which assesses the degree to which water diffuses at each location, and fractional anisotropy (FA), which measures the coherence of this diffusion in one particular direction. Together, MD and FA provide a complementary description of white matter microstructure.
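To make the two scalar maps concrete, the following minimal sketch computes MD and FA for a single voxel from the three eigenvalues of its diffusion tensor, using the standard textbook definitions. The function names and the example eigenvalues are illustrative assumptions and are not taken from the study.

```python
import math


def mean_diffusivity(l1, l2, l3):
    """MD: the average of the three diffusion tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0


def fractional_anisotropy(l1, l2, l3):
    """FA: normalized dispersion of the eigenvalues, ranging from 0 to 1."""
    md = mean_diffusivity(l1, l2, l3)
    numerator = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0.0:
        return 0.0  # no diffusion signal: define FA as 0
    return math.sqrt(1.5 * numerator / denominator)


# Hypothetical eigenvalues (mm^2/s) for a voxel in a coherent fiber bundle.
md = mean_diffusivity(1.7e-3, 0.3e-3, 0.2e-3)
fa = fractional_anisotropy(1.7e-3, 0.3e-3, 0.2e-3)
```

FA is 0 when diffusion is identical in all directions and approaches 1 when diffusion occurs along a single axis, which is why the two maps carry different information.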
With the increasing number of publicly available neuroimaging databases, a crucial goal is to combine large-scale imaging studies to increase the power of statistical analyses that test common biological hypotheses. For instance, for life-span studies, combining data across sites and age ranges is essential for obtaining the necessary number of participants of each age. The success of combining multi-site imaging data depends critically on the comparability of the images across sites. As with other imaging modalities, DTI images are subject to technical variability across scans, including heterogeneity in the imaging protocol, variations in the scanning parameters and differences in the scanner manufacturers (Zhu et al., 2009, Zhu et al., 2011). Among others, the reliability of FA and MD maps has been shown to be affected by angular and spatial resolution (Zhan et al., 2010, Alexander et al., 2001, Kim et al., 2006), the number of diffusion weighting directions (Giannelli et al., 2009), the number of gradient sampling orientations (Jones, 2004), the number of b-values (Correia et al., 2009), and the b-values themselves.
In the design of multi-site studies, defining a standardized DTI protocol is a first step towards reducing inter-scanner variability. However, even in the presence of a standardized protocol, differences in scanner manufacturer, field strength and other scanner characteristics will systematically affect the DTI images and induce inter-scanner variation. Image-based meta-analysis (IBMA) techniques, reviewed in Salimi-Khorshidi et al. (2009), are common methods for combining results from multi-site studies with the goal of testing a statistical hypothesis. IBMA methods circumvent the need to harmonize images across sites by performing site-specific statistical analyses and combining the results afterwards. Fisher's p-value combining method and Stouffer's z-transformation test, applied to z- or t-maps, are two common IBMA techniques. Fixed-effect models based on (possibly) normalized images, and mixed-effect models to model the inter- and intra-site variability, are other common techniques for the analysis of multi-site data. Indeed, meta-analysis methods have shown great promise for studies with a large number of participants at each site. For instance, the ENIGMA-DTI working group has been successfully using and validating meta-analysis techniques on such multi-site DTI data (Jahanshad et al., 2013, Kochunov et al., 2014).
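As an illustration of the two combining rules named above, the sketch below applies them to the statistics of a single voxel collected from several sites. It uses only the standard textbook formulas; the function names, the optional weights and the example values are assumptions made for illustration, and an actual IBMA would apply the same computation voxel by voxel to whole maps.

```python
import math
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(log p) follows a chi-square with 2k degrees of freedom."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # half of the chi-square statistic
    # Closed-form chi-square survival function, valid for an even number of degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combined_z(z_values, weights=None):
    """Stouffer's method: weighted sum of z-scores, rescaled to unit variance."""
    if weights is None:
        weights = [1.0] * len(z_values)
    numerator = sum(w * z for w, z in zip(weights, z_values))
    return numerator / math.sqrt(sum(w * w for w in weights))


# Hypothetical per-site results for one voxel.
combined_p = fisher_combined_p([0.05, 0.05])
combined_z = stouffer_combined_z([2.0, 2.0])
one_sided_p = 1.0 - NormalDist().cdf(combined_z)
```

Both rules need only the site-level statistics, which is what allows IBMA to skip image harmonization.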
Meta-analysis techniques have several limitations, however. First, study-specific samples might not be sufficient to estimate the true biological variability in the population (Mirzaalian et al., 2016). As described by De Wit et al. (2014), adjusting for variability at the participant level is problematic in meta-analyses, since only group-level demographic and clinical information is available. Another limitation is that, in a multi-site study, site-specific summary statistics are affected by unbalanced data. For instance, a variance computed from an unbalanced dataset is highly affected by the ratio of cases to controls in the sample (Linn et al., 2016b). Finally, for imaging studies with small sample sizes, the parameters of the z-score transformations cannot be robustly estimated, yielding suboptimal statistical inferences.
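The dependence of a site-level variance on the case-to-control ratio follows from the law of total variance. The sketch below assumes a deliberately simple setting that is not taken from the paper: two groups with a common within-group standard deviation whose means differ by a fixed amount. All names and numbers are hypothetical.

```python
def mixture_variance(case_fraction, group_difference, within_sd):
    """Variance of a pooled case/control sample under a two-group mixture.

    Law of total variance: within-group variance plus p * (1 - p) * difference^2.
    """
    p = case_fraction
    return within_sd ** 2 + p * (1.0 - p) * group_difference ** 2


# Same biology, different sampling: a balanced site versus a site with 10% cases.
balanced = mixture_variance(0.5, group_difference=1.0, within_sd=1.0)
unbalanced = mixture_variance(0.1, group_difference=1.0, within_sd=1.0)
```

Under these assumptions the two sites would use different scaling constants in a z-score transformation even if their scanners were identical.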
Mega-analyses, in which the imaging data are combined before performing statistical inferences, have the potential to increase power compared to meta-analyses (De Wit et al., 2014). In addition, pooling imaging data across studies has the benefit of enriching the clinical picture of the sample by increasing the variability in symptom profiles (Turner, 2014) and demographic variables. This is particularly important for age-span studies. However, pooling data across studies may increase the heterogeneity of the imaging measurements by introducing undesirable variability caused by differences in scanner protocols. Harmonization of the pooled data is therefore necessary to ensure the success of mega-analyses. The DTI harmonization technique proposed in Mirzaalian et al. (2016) is a first step in that direction. The method is based on rotation-invariant spherical harmonics (RISH) and combines the unprocessed DTI images across scanners. Unfortunately, a major drawback of the method is that it requires DTI data to have similar acquisition parameters across sites, a requirement that is often impossible to meet in multi-site observational analyses.
In this work, we adapted and compared several statistical approaches for the harmonization of DTI studies that were previously developed for other data types: functional normalization (Fortin et al., 2014), RAVEL (Fortin et al., 2016a), surrogate variable analysis (SVA) (Leek and Storey, 2007) and ComBat (Johnson et al., 2007), a popular batch adjustment method developed for genomics data. We also include a simple method that globally rescales the data for each site using a z-score transformation map common to all features, which we refer to as “global scaling”. For the evaluation of the different harmonization techniques, we use DTI data acquired as part of two large imaging studies (Satterthwaite et al., 2014; Ghanbari et al., 2014) with images acquired on different scanners, using different imaging protocols. The participants are teenagers, and were matched across studies for age, gender, ethnicity, and handedness.
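The idea of a single site-wide rescaling can be sketched as follows. This is one plausible reading of a z-score transformation common to all features, in which each site is standardized with one mean and one standard deviation pooled over all of its subjects and features. It is an assumption for illustration and not necessarily the estimator used in this work; the function name and the toy values are hypothetical.

```python
from statistics import fmean, pstdev


def global_scale(site_data):
    """Standardize one site's data with a single mean and SD shared by all features.

    site_data: list of subjects from one site, each a list of feature values.
    """
    pooled = [value for subject in site_data for value in subject]
    site_mean = fmean(pooled)
    site_sd = pstdev(pooled)
    if site_sd == 0.0:
        raise ValueError("site has no variability; cannot rescale")
    return [[(value - site_mean) / site_sd for value in subject] for subject in site_data]


# Toy FA values: site B equals site A plus a constant offset of 0.10 everywhere.
site_a = global_scale([[0.40, 0.50], [0.60, 0.70]])
site_b = global_scale([[0.50, 0.60], [0.70, 0.80]])
```

Because the same two constants are applied to every feature, a rescaling of this kind can remove a site offset that is uniform across the brain but cannot correct effects that vary from region to region.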
We first analyze site-related differences in the FA, MD, radial diffusivity (RD) and axial diffusivity (AD) measurements, and show evidence of significant site effects that differ across the brain. This motivates the need for a harmonization technique that is sensitive to region-specific scanner effects. Then, we harmonize the data with each of the proposed harmonization methods, and evaluate their performance using a comprehensive evaluation framework. We show that ComBat is the most effective harmonization technique, as it removes unwanted variation induced by site while preserving between-subject biological variability. ComBat is a promising harmonization technique for other imaging modalities since it does not make assumptions about the origin of the site effects.
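To show what a region-specific correction looks like, the sketch below estimates a separate location (mean) and scale (standard deviation) for every feature within every site and maps each feature onto a common mean and a pooled within-site standard deviation. This is a deliberately simplified stand-in: ComBat additionally adjusts for biological covariates and shrinks the site parameters with empirical Bayes, both of which are omitted here. The function name and data layout are assumptions.

```python
import math
from statistics import fmean, pvariance


def adjust_location_scale(sites):
    """Per-feature location/scale adjustment across sites (no covariates, no shrinkage).

    sites: dict mapping a site name to a list of subjects, each a list of features.
    Returns a new dict with the same layout; the input is left unchanged.
    """
    names = list(sites)
    n_features = len(sites[names[0]][0])
    adjusted = {name: [list(subject) for subject in sites[name]] for name in names}
    for v in range(n_features):
        columns = {name: [subject[v] for subject in sites[name]] for name in names}
        pooled = [x for name in names for x in columns[name]]
        target_mean = fmean(pooled)
        # Pooled within-site variance, weighted by the number of subjects per site.
        within_var = sum(len(columns[n]) * pvariance(columns[n]) for n in names) / len(pooled)
        target_sd = math.sqrt(within_var)
        for name in names:
            site_mean = fmean(columns[name])
            site_sd = math.sqrt(pvariance(columns[name]))
            for j, x in enumerate(columns[name]):
                z = (x - site_mean) / site_sd if site_sd > 0 else 0.0
                adjusted[name][j][v] = target_mean + target_sd * z
    return adjusted
```

Note that without covariates in the model, any biological difference between the site samples would be removed along with the site effect, which is one reason the full method models such variables explicitly.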
Data
We consider two DTI studies from two different scanners. To investigate the effect of scanner variations on the DTI measurements, we matched the participants for age, gender, ethnicity and handedness, resulting in 105 participants retained in each study for further analysis. The characteristics of each dataset are described below.
Dataset 1 (Site 1): PNC dataset. We selected a subset of the Philadelphia Neurodevelopmental Cohort (PNC) (Satterthwaite et al., 2014), and included 105 healthy
Results
The results are organized as follows. We first show evidence of substantial site effects in the FA and MD maps in Section 3.1, and then show how the different harmonization methods perform at removing those site effects in Section 3.2. In Section 3.3, we discuss the biological variability at each site separately, before and after harmonization, and show how site effects affect the number of voxels associated with age. In Section 3.4, we present our experiments for simulating different levels
Discussion
In this work, we investigated the effects of combining DTI studies across sites and scanners on the statistical analyses. We used FA and MD maps from data acquired at two sites with different scanners. We first showed that combining the two studies without proper harmonization led to a decrease in the power to detect voxels associated with age. This confirmed that DTI measurements are highly affected by small changes in the scanner parameters, as those affect the measured water diffusivity.
Software
All of the postprocessing analysis was performed in the R statistical software (v3.2.0). For SVA and ComBat, reference implementations from the sva package were used (v3.22.0). All figures were generated in R with customized and reproducible scripts, using several functions from the package fslr (Muschelli et al., 2015) (v2.12). We have adapted the ComBat methodology to imaging data, and our implementation is available in both R and Matlab on GitHub (//github.com/Jfortin1/ComBatHarmonization
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
JPF developed the methodology and analyzed the data. DP, BT and TW processed the data. ME, KR, DR, TS, RCG, REG and RTSc recruited the participants and acquired the data. JPF and RTSh wrote the manuscript. RTSh and RV supervised the work. All authors read and approved the final manuscript.
Funding
The research was supported in part by R01NS085211 and R21NS093349 from the National Institute of Neurological Disorders and Stroke, R01MH092862 and R01MH107703 from the National Institute of Mental Health and R01HD089390 from the National Institute of Child Health and Human Development. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
References (58)
- et al.
Diffusion tensor imaging of the brain
Neurotherapeutics
(2007)
- et al.
White matter development during late adolescence in healthy males: a cross-sectional diffusion tensor imaging study
Neuroimage
(2007)
- et al.
Longitudinal characterization of white matter maturation during adolescence
Brain Res.
(2010)
- et al.
Looking for the optimal DTI acquisition scheme given a maximum scan time: are more b-values a waste of time?
Magn. Reson. Imaging
(2009)
- et al.
Removing inter-subject technical variability in magnetic resonance imaging studies
NeuroImage
(2016)
- et al.
Identifying group discriminative and age regressive sub-networks from DTI-based connectivity via a unified framework of non-negative matrix factorization and graph embedding
Med. Image Anal.
(2014)
- et al.
Longitudinal changes in grey and white matter during adolescence
Neuroimage
(2010)
- et al.
Multi-site genetic analysis of diffusion images and voxelwise heritability analysis: a pilot project of the ENIGMA–DTI working group
Neuroimage
(2013)
- et al.
A global optimisation method for robust affine registration of brain images
Med. Image Anal.
(2001)
- et al.
Improved optimization for the robust and accurate linear registration and motion correction of brain images
Neuroimage
(2002)
Spatial resolution dependence of DTI tractography in human occipito-callosal region
Neuroimage
Multi-site study of additive genetic effects on fractional anisotropy of cerebral white matter: comparing meta and megaanalytical approaches for data pooling
NeuroImage
Changes in white matter microstructure in the developing brain: a longitudinal diffusion tensor imaging study of children from 4 to 11 years of age
NeuroImage
Diffusion tensor imaging of white matter tract evolution over the lifespan
Neuroimage
Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine
NeuroImage
Inter-site and inter-scanner diffusion MRI data harmonization
NeuroImage
DRAMMS: deformable registration via attribute matching and mutual-saliency weighting
Med. Image Anal.
Meta-analysis of neuroimaging data: a comparison of image-based and coordinate-based pooling of studies
Neuroimage
Neuroimaging of the Philadelphia Neurodevelopmental Cohort
Neuroimage
Australian imaging biomarkers lifestyle flagship study of ageing, and Alzheimer's disease neuroimaging initiative. Statistical normalization techniques for magnetic resonance imaging
Neuroimage Clin.
Tract-based spatial statistics: voxelwise analysis of multi-subject diffusion data
Neuroimage
DWI filtering using joint information for DTI and HARDI
Med. Image Anal.
How does angular resolution affect diffusion imaging measures?
Neuroimage
Quantification of accuracy and precision of multi-center DTI measurements: a diffusion phantom and human brain study
Neuroimage
Analysis of partial volume effects in diffusion-tensor MRI
Magn. Reson. Med.
White matter development during childhood and adolescence: a cross-sectional diffusion tensor imaging study
Cereb. Cortex
Statistical methods for assessing agreement between two methods of clinical measurement
Lancet
A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
Bioinformatics
Visualizing Data
- 1
Equal contribution.