Knowledge gap or limitations of published studies | Addressed by this study? | Description of how addressed in our study |
Few studies use commercially available AI systems. | Partly | The AI algorithm used in this study [10] underlies a triage product that is FDA-approved and commercially available in the USA. |
Studies have used relatively small datasets, often consisting of mammograms from several hundred women (rarely several thousand). Larger validation datasets are required. | Yes | A large validation dataset including 109 000 women will be used. |
The same or selected subsets of the same datasets were used to train and validate models. Validation using independent, external datasets is required. | Yes | The study dataset is external to and independent from the datasets used to train the algorithm. |
Datasets were commonly enriched with malignant lesions, with studies often selecting images containing suspicious abnormalities. Studies are required in unselected screening populations. | Yes | The study dataset is a consecutive, unselected population drawn from a real-world, biennial, population-based breast screening programme (BreastScreen WA). The dataset is not enriched with cancers. The prevalence and disease spectrum of screen-detected and interval cancers are representative of population breast screening. |
There is a paucity of studies reporting conventional screening metrics (CDR and recall rate). | Yes | The inclusion of unique, consecutive screening episodes will allow estimation of CDR and recall rate (it is not possible to accurately derive these metrics from case-controlled, cancer-enriched datasets). |
There are limited data on AI versus human interpretation. Future studies should compare AI with radiologists’ performance or report the incremental improvement for AI algorithms in combination with radiologists. | Yes | The comparative accuracy of AI and radiologists will be estimated in terms of AUC-ROC, sensitivity and specificity. Incremental rates of cancer detection and recall will be estimated for double-reading with and without AI. |
There are no studies on women’s or societal perspectives on the acceptability of AI. | No | This is beyond the scope of the present study. A parallel stream of social and ethical research by some of the study investigators will explore the acceptability of AI. |
Future studies should include images from digital breast tomosynthesis, given the rapid adoption of this technology. | No | This is beyond the scope of the present study. Digital breast tomosynthesis is not currently used in Australian publicly funded population breast screening programmes. |
AI, artificial intelligence; AUC-ROC, area under the receiver operating characteristic curve; CDR, cancer detection rate; FDA, Food and Drug Administration.
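
The conventional screening metrics named in the table can be illustrated with a minimal Python sketch. It assumes the usual programme-level definitions: CDR as screen-detected cancers per 1000 screening episodes, recall rate as the proportion of episodes recalled, and sensitivity with interval cancers counted as false negatives. The class and function names are hypothetical, the counts are invented for illustration only and are not study data, and the study's own analysis plan may define these quantities differently.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical counts for one reading strategy (illustrative, not study data)."""
    screens: int           # unique, consecutive screening episodes
    recalls: int           # episodes recalled for further assessment
    screen_detected: int   # cancers detected at screening
    interval_cancers: int  # cancers diagnosed after a negative screen


def cancer_detection_rate(c: ScreeningCounts, per: int = 1000) -> float:
    """Screen-detected cancers per `per` screening episodes."""
    return per * c.screen_detected / c.screens


def recall_rate(c: ScreeningCounts) -> float:
    """Proportion of screening episodes recalled for assessment."""
    return c.recalls / c.screens


def sensitivity(c: ScreeningCounts) -> float:
    """Assumed definition: interval cancers are treated as false negatives."""
    return c.screen_detected / (c.screen_detected + c.interval_cancers)


def specificity(c: ScreeningCounts) -> float:
    """Assumes every screen-detected cancer came from a recalled episode."""
    non_cancers = c.screens - c.screen_detected - c.interval_cancers
    false_positives = c.recalls - c.screen_detected
    return (non_cancers - false_positives) / non_cancers


# Invented example counts, chosen only to make the arithmetic easy to follow.
example = ScreeningCounts(screens=10_000, recalls=400,
                          screen_detected=60, interval_cancers=20)

print(f"CDR per 1000 screens: {cancer_detection_rate(example):.1f}")
print(f"Recall rate: {recall_rate(example):.3f}")
print(f"Sensitivity: {sensitivity(example):.3f}")
print(f"Specificity: {specificity(example):.3f}")
```

These ratios are only meaningful when the denominator is an unselected series of screening episodes: in a cancer-enriched dataset the ratio of cancers to screens is fixed by the sampling design, so CDR and recall rate computed this way would not reflect screening practice.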