An empirical investigation into the number of subjects required for an event-related fMRI study
Introduction
The number of subjects scanned in an fMRI study is very often dictated by practical constraints such as access to scanning time and costs. Under these conditions, an investigator must make a trade-off between the number of subjects to scan and the length of the experiment. Even though these decisions are made frequently, little is known about how many trials, scans or subjects are needed to yield reliable results.
Previous research addressing these issues has shown that the spatial extent of BOLD signal activation maps increases as the number of single trials averaged increases (Huettel and McCarthy, 2001). These authors have demonstrated that at an average of 50 trials (a typical number of trials in an fMRI study), even though the haemodynamic shape was stable, only 50% of the eventually activated voxels were deemed significant. The volume of the activation maps only reached asymptotic values after 150 trials were averaged. Similarly, for block design studies, it has been shown that when averaging across progressively increasing numbers of scans (where a scan, in this case, is defined as a time series of 100 volumes obtained during one 200 s stimulus presentation period: 20 s ON, 20 s OFF, etc.), the spatial extent of the activated voxels increased monotonically and failed to asymptote with as many as 22 scans (Saad et al., 2003).
In practice, it can be very difficult to obtain the number of trials and scans that the above studies call for in each subject. The required number may also depend heavily on the type of study involved. For example, a GO/NOGO study needs to develop a prepotency to respond, and thus the trials of interest (NOGOs), by design, must be infrequent. Under these circumstances, the number of trials will be dictated by the length of time the subject can comfortably remain in the scanner while maintaining their ability to perform the task. In this case, to increase the power and thus the reliability of the study, one viable option is to increase the number of subjects scanned. This, in turn, leads one to ask how many subjects are necessary to obtain a reliable group activation map.
To our knowledge, very few published studies have addressed this question. The first such paper (Friston et al., 1999) showed that conjunction analysis with a fixed-effect model is sufficient to make inferences about characteristics that are typical of populations. This method can reduce the number of subjects needed to infer differences between populations, relative to the number normally required by a standard random-effects model. Although this method is very useful, it does not give a clear indication of how many subjects are necessary to perform an event-related fMRI study. Desmond and Glover (2002) estimated mean differences and variability between two block conditions with fMRI data. These values were used to generate power curves and to estimate the number of subjects needed to yield reliable results. For a threshold of P = 0.05, 12 subjects were required to achieve 80% power. At more realistic fMRI thresholds (i.e., after correcting for multiple comparisons), approximately twice as many subjects were required to yield similar power. However, this study addressed statistical power in block design experiments and may not extend to event-related designs.
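To make the relationship between threshold, sample size and power concrete, the following is a minimal sketch of a power curve under a normal approximation to a one-sided, one-sample test. It is not the procedure used by Desmond and Glover (2002); the function names, the standardized effect size of 0.8 and the thresholds are assumptions chosen purely for illustration, and the normal approximation tends to overstate power at small sample sizes compared with an exact t-based calculation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_subjects, alpha=0.05):
    """Approximate power of a one-sided, one-sample test.

    effect_size is a hypothetical standardized effect (mean / SD across
    subjects). A normal approximation is used instead of the t distribution.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return NormalDist().cdf(effect_size * sqrt(n_subjects) - z_crit)


def subjects_needed(effect_size, alpha=0.05, target_power=0.80, max_n=1000):
    """Smallest n whose approximate power reaches target_power, else None."""
    for n in range(2, max_n + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # Illustrative values only: an assumed effect size and two thresholds.
    for alpha in (0.05, 0.001):
        n = subjects_needed(0.8, alpha=alpha)
        print(f"alpha={alpha}: about {n} subjects for 80% power")
```

Under these assumptions, tightening the threshold raises the number of subjects needed for the same power, which is the qualitative pattern described above for corrected thresholds.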
This paper reports an empirical approach to the question of sample size and statistical reliability. Fifty-eight subjects performing similar event-related GO/NOGO tasks were tested. By varying the number of subjects included in the group activation maps, we were able to derive empirically the stability of these maps for different sample sizes.
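One possible reading of this subsampling idea can be sketched as follows, assuming simulated data rather than real fMRI maps. The sketch treats the map from all subjects as a reference, draws random subsets of a given size, and reports the average fraction of reference voxels that the subset map also detects. All names (`simulate`, `t_map`, `detected_fraction`), the effect size, the voxel counts and the fixed t cutoff are hypothetical; in particular, a fixed t cutoff is a simplification, since a fixed P value corresponds to different t values at different sample sizes.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate(n_subjects=58, n_voxels=200, n_active=40, effect=0.6, seed=1):
    """Hypothetical data: one row per subject, one value per voxel."""
    rng = random.Random(seed)
    return [
        [rng.gauss(effect if v < n_active else 0.0, 1.0) for v in range(n_voxels)]
        for _ in range(n_subjects)
    ]


def t_map(data, subject_ids):
    """One-sample t statistic per voxel across the chosen subjects."""
    n = len(subject_ids)
    stats = []
    for v in range(len(data[0])):
        vals = [data[s][v] for s in subject_ids]
        stats.append(mean(vals) / (stdev(vals) / sqrt(n)))
    return stats


def detected_fraction(data, n_subjects, t_threshold, n_resamples=20, seed=0):
    """Mean fraction of full-sample voxels also detected with n_subjects."""
    rng = random.Random(seed)
    all_ids = list(range(len(data)))
    # Reference map: voxels above threshold when every subject is included.
    reference = {v for v, t in enumerate(t_map(data, all_ids)) if t > t_threshold}
    if not reference:
        return 0.0
    fractions = []
    for _ in range(n_resamples):
        subset = rng.sample(all_ids, n_subjects)
        active = {v for v, t in enumerate(t_map(data, subset)) if t > t_threshold}
        fractions.append(len(reference & active) / len(reference))
    return mean(fractions)


if __name__ == "__main__":
    data = simulate()
    for n in (10, 20, 40, 58):
        print(n, round(detected_fraction(data, n, t_threshold=3.0), 2))
```

Plotting the returned fraction against the subset size gives a curve of the kind discussed in this paper, although the simulated numbers carry no empirical meaning.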
Section snippets
Subjects and task design
Fifty-eight right-handed subjects (35 female, mean age: 30, range: 18–46) completed a GO/NOGO task after providing written informed consent. The GO/NOGO task required frequent responses and occasional response inhibitions. Subjects were presented with a serial stream of letters. A response was required for every occurrence of the alternating target letters, X and Y, unless the alternation order was broken. Minor variations in the task were presented to four different groups. Fourteen subjects
“Gold standard” analyses
The results of the power analyses are shown in Fig. 1. It was expected that we would find a “shoulder” in the graph after a certain number of subjects, which would then asymptote to a straight line up to 58 subjects. As can clearly be seen, this did not happen. The best-case scenario was at P = 0.01 where the power only reaches 0.5 after 32 subjects. As the P value became stricter (P = 0.000001), this deteriorated to 0.5 at 50 subjects. It is obvious that these activation maps are severely
Conclusions
When planning an event-related fMRI study, it is important to know how many subjects are required to yield reliable results. This paper attempted to answer that question empirically. Although these results might be applicable to the majority of fMRI researchers investigating cognitive processes (such as inhibition), it is important to note that they may not translate to studies with a higher signal-to-noise ratio or with less intersubject neuroanatomical variability. The
Acknowledgements
Supported in part by USPHS grants DA14100, GCRC M01 RR00058 and by the Irish Research Council for Humanities and Social Sciences.
References (13)
AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. (1996)
et al., Estimating sample size in functional MRI (fMRI) neuroimaging studies: statistical power analyses. J. Neurosci. Methods (2002)
et al., How many subjects constitute a study? NeuroImage (1999)
et al., Dissociable executive functions in the dynamic control of behavior: inhibition, error detection, and correction. NeuroImage (2002)
et al., A midline dissociation between error-processing and response-conflict monitoring. NeuroImage (2003)
et al., Artifactual fMRI group and condition differences driven by performance confounds. NeuroImage (2004)