The problem is the method used in practice for the selection of variables to be included in logistic regression and Cox models in observational medical studies.
The motivation comes from the authors’ work as statistical consultants. Many medical researchers believed that only variables which were individually significant should be included in the fitted model. The correct procedure is instead for the model to contain variables that are jointly significant. Finding such a model requires fitting several models and selecting the best, rather than fitting just one. An example is in Table 1 below.
The paper presents the results of a survey in which the frequency of an incorrect method of variable selection was measured as a function of the assessed statistical expertise of the authors of the papers: first author, any other author or none. The expertise was based on the authors’ departmental affiliations. It was found that the frequency of correct variable selection increased with the statistical qualifications of the authors. Clinical trials, as opposed to observational studies, were not included.
The authors also consider how the situation might be improved. A breakdown of the results by country, for papers in which the first author is not an expert, shows that North America and Northern Europe have relatively high expert involvement compared with East Asia, which has lower involvement. Taiwan is an exception. In the authors’ own country of Japan the education of biostatisticians is developing rapidly. However, it will take time to develop well-trained experts and this is only in one country. The authors suggest that data be made available and analyzed as part of the peer review process. Such suggestions have been made before, for example in power calculations in grant proposals. It would be excellent if some such system could be made to work. Unfortunately, statistical referees are busy and the standard of reviewing of statistical papers seems to be deteriorating, not improving.
In their Supplementary Table 3 the authors present an example of a logistic regression analysis which illustrates the consequences of the incorrect procedure using inference on one factor at a time. The results of the proper analysis are also given. I summarise this analysis in Table 1. There are three factors:
• A: adjuvant chemotherapy
• L: lymph node metastasis
• B: biomarker positive
The three left-hand columns of the table indicate the factors included in the model and the right-hand columns indicate the significant variables. Seven models are fitted. The first three rows show the results of fitting the three one-factor models: L and B are individually significant, but not A. For the three two-factor models, both factors are significant when L and B are fitted and when A and L are fitted, but when A and B are fitted, only B is significant. However, as the last row of the table shows, when all three factors are included, all are significant.
Table 1, available at https://bit.ly/2rIucUh: Significance of factors when seven different logistic regression models are fitted to hypothetical data on recurrence of cancer after surgery.
The example shows that all three variables need to be fitted in order to obtain the best model. In other examples only some of the factors may be required. But several models have to be fitted to determine which best describes the data. Such tables can be amplified by using one or more asterisks to express significance levels and by adding an extra column for some overall measure of fit.
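The procedure of fitting every subset of the three factors and recording which terms are significant can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated here (they are not the authors' hypothetical data and need not reproduce the pattern in Table 1), the effect sizes in TRUE_EFFECTS are invented, significance is judged by a two-sided Wald test at the 5% level, and the small Newton-Raphson fitter stands in for whatever statistical package would be used in practice.

```python
import itertools
import math
import random

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_logistic(X, y, iterations=25):
    """Newton-Raphson logistic fit; returns coefficients and standard errors."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        probs = [1 / (1 + math.exp(-sum(b * x for b, x in zip(beta, row)))) for row in X]
        grad = [sum((yi - p) * row[j] for row, yi, p in zip(X, y, probs)) for j in range(k)]
        info = [[sum(p * (1 - p) * row[i] * row[j] for row, p in zip(X, probs))
                 for j in range(k)] for i in range(k)]
        cov = invert(info)  # inverse information; at convergence, the covariance estimate
        beta = [b + sum(cov[j][i] * grad[i] for i in range(k)) for j, b in enumerate(beta)]
    return beta, [math.sqrt(cov[j][j]) for j in range(k)]

# Simulated patients: assumed, illustrative effect sizes (log-odds ratios).
random.seed(1)
FACTORS = ["A", "L", "B"]
TRUE_EFFECTS = {"A": 0.8, "L": 1.0, "B": 1.2}
rows = []
for _ in range(400):
    patient = {f: random.randint(0, 1) for f in FACTORS}
    logit = -0.5 + sum(TRUE_EFFECTS[f] * patient[f] for f in FACTORS)
    patient["recurrence"] = int(random.random() < 1 / (1 + math.exp(-logit)))
    rows.append(patient)
y = [r["recurrence"] for r in rows]

# Fit all 2^3 - 1 = 7 non-empty subsets and record the significant factors.
results = []
for size in range(1, len(FACTORS) + 1):
    for subset in itertools.combinations(FACTORS, size):
        X = [[1.0] + [float(r[f]) for f in subset] for r in rows]
        beta, se = fit_logistic(X, y)
        # Two-sided Wald p-value for each factor (the intercept is skipped).
        p_values = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta[1:], se[1:])]
        significant = [f for f, p in zip(subset, p_values) if p < 0.05]
        results.append((subset, significant))
        print(f"fitted: {'+'.join(subset):<6} significant: {', '.join(significant) or 'none'}")
```

Each printed line corresponds to one row of a table laid out like Table 1: the factors fitted on the left and the significant ones on the right.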
In their discussion section the authors also mention problems that arise with few data and several factors. In this case again several models should be assessed, although it will not be possible to fit a full model like that in the last row of Table 1. However, a similar table may be helpful in assessing the properties of the various fitted models, perhaps augmented by a measure of model adequacy such as the Akaike information criterion (AIC).
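As a rough illustration of how AIC could augment such a table when only small models can be fitted, the sketch below ranks the three one-factor models on a toy data set. The twelve patients are invented purely for illustration. The closed form used for the log-likelihood relies on the fact that a logistic model with an intercept and a single binary factor reproduces the observed event proportion in each of the two groups. AIC is computed as 2k - 2 log L with k = 2 parameters, and lower values indicate a better trade-off between fit and complexity.

```python
import math

def xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)

def binomial_loglik(events, total):
    """Maximised log-likelihood for a group sharing one event probability."""
    p = events / total
    return xlogy(events, p) + xlogy(total - events, 1 - p)

def aic_one_factor(factor, outcome):
    """AIC = 2k - 2 log L for intercept + one binary factor (k = 2)."""
    loglik = 0.0
    for level in (0, 1):
        group = [o for f, o in zip(factor, outcome) if f == level]
        loglik += binomial_loglik(sum(group), len(group))
    return 2 * 2 - 2 * loglik

# Toy data, invented for illustration: 12 patients, three binary factors.
recurrence = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
factors = {
    "A": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "L": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "B": [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
}

aic = {name: aic_one_factor(values, recurrence) for name, values in factors.items()}
ranking = sorted(aic, key=aic.get)  # best (lowest AIC) first
for name in ranking:
    print(f"model with {name}: AIC = {aic[name]:.2f}")
```

With more factors than the data can support jointly, the same ranking could be extended to two-factor models, which would need an iterative fit rather than this closed form.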