Readers of our article have indicated that the description of Figure 1 was insufficient to help them understand what it represents. Below, we further describe the figure, correct two minor errors, and provide a summary table.
Figure 1 description:
Figure 1 was designed to show two overarching pieces of information. First, the line plots allow the reader to see how the filtering process progresses for individual studies. Each line represents a single study going from one phase to the next. Each phase is centered on the midpoint of the range, giving the figure a Buchner funnel-like shape. The purpose of this representation, as opposed to anchoring the x-axis to 0, for instance, is that it better highlights the rank changes across phases. Just because a study has a large number of papers at one phase does not mean that it will have the largest number included at a subsequent phase, and this difference can sometimes be pronounced (as evidenced by steeply crossing lines). The second piece of information is the set of summary statistics represented in boxplots overlaid at each phase, which are more self-explanatory.
Figure 1 data processing:
For the lineplots, four studies were excluded because their values were greater than 2.5 standard deviations above the mean for the ‘Total N Found’ variable. Had these four studies been plotted, all other lines would have been compressed to the left of the figure because of how far these four values would have extended the axis to the right. These four studies were linewise deleted from the dataset for plotting, meaning these studies were not represented in subsequent phases. In addition, any study missing a value for Total N Found was linewise excluded. Thus data are included in the lineplots for 186, 184, 183, and 186 studies for the four phases, respectively.
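The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce Figure 1: the function name, the study labels, and the values are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator) was used, which the description does not state.

```python
from statistics import mean, stdev

def studies_kept_for_lineplot(total_found, n_sd=2.5):
    """Return the IDs of studies kept for the line plots.

    total_found maps a study ID to its 'Total N Found' value, or to None
    when that value is missing. Studies with a missing value are dropped,
    as are studies more than n_sd standard deviations above the mean.
    """
    # Drop studies with no 'Total N Found' value.
    observed = {sid: v for sid, v in total_found.items() if v is not None}
    values = list(observed.values())
    # Assumption: sample standard deviation (n - 1 denominator).
    cutoff = mean(values) + n_sd * stdev(values)
    return [sid for sid, v in observed.items() if v <= cutoff]

# Hypothetical data: ten ordinary studies, one extreme study, one missing.
example = {f"s{i}": 100 * i for i in range(1, 11)}
example["s11"] = 50000
example["s12"] = None

kept = studies_kept_for_lineplot(example)
```

In this made-up example the extreme study and the study with a missing value are both dropped, and the remaining ten are kept for plotting.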
For the boxplots, the values represent all available data, including those values excluded for the lineplots. The truncation for the lineplots was meant purely for a graphical representation of individual studies’ progress through the phases, while the boxplots are meant to summarize the data. Summary data for these boxplots can be found in the table below.
Phase                 Min   1st Q   Median     Mean    3rd Q     Max   NAs     N
Total N Found          27   559.2   1781.0   4851.3   4759.8   92022     7   190
Titles & Abstracts     14   355.0   1267.5   3669.7   3874.0   77914     5   192
Full Paper              0    28.8     62.5    179.1    151.0    4385     5   192
Total N Included        0     8.0     15.0     25.6     25.2     291     1   196
Note that in Figure 1, the maxima for phases I and II are listed as 92,020 and 77,910, which differ from the values in the table. The software used to make the data summaries, R, reported the summaries with a default of 4 significant digits, which rounded these two maxima from their true values of 92,022 and 77,914.
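The effect of rounding to 4 significant digits can be reproduced with a short Python sketch. The helper name `round_sig` is hypothetical and is not part of R; it only mimics rounding to a fixed number of significant digits for nonzero values.

```python
from math import floor, log10

def round_sig(x, digits=4):
    """Round a nonzero number to the given number of significant digits."""
    # Position of the leading digit, e.g. 4 for 92022 (9.2022 x 10^4).
    magnitude = floor(log10(abs(x)))
    return round(x, digits - 1 - magnitude)

rounded_total_found = round_sig(92022)       # 92020
rounded_titles_abstracts = round_sig(77914)  # 77910
```

Both true maxima have five digits, so keeping four significant digits rounds the final digit away, which matches the values printed in Figure 1.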