Create a registry of results that students can replicate

Earlier this month, a large-scale replication project highlighted how difficult it is to repeat the results in published papers. During my three decades as head of a basic-biology laboratory, I have experienced the same thing. One of my biggest frustrations as a scientist is that it’s so hard to know which exciting results are robust enough to build on. As a mentor to early-career scientists, I try to hone their ability to spot what is unlikely to be replicable, such as articles published with oddly cropped images or protocols that don’t mention any replication. Yet my students have cumulatively squandered decades of research pursuing results that proved impossible to confirm. The virtuous circle of progress is broken.

Many steps have been taken to address this problem: better reporting, better career incentives, the separation of exploratory work from confirmatory work, and the development of infrastructure for large, collaborative confirmatory experiments (O. B. Amaral and K. Neves Nature 597, 329-331; 2021).

As the year draws to a close, it is natural to consider how to do better in the future. One step would be to explicitly restructure scientific publications so that they fulfill their function as building blocks of knowledge. Previous suggestions include requiring authors to include statements of generalizability or a numerical confidence level. Here I propose two new strategies.

First, each published study must articulate specific testable conclusions.

In my field – cancer biology – an overall conclusion might be that enzyme Y regulates cell migration in cancer. This could be built from a series of experimental results, each presented quantitatively, along with the relevant metrics. It’s easy to imagine a series like this: (a) compound X inhibits Y in vitro, with a Ki (a biochemical measure of inhibition) of 300 nM; (b) compound X inhibits Y in cells with an IC50 (the concentration giving 50% inhibition) of 1 µM; (c) compound X inhibits cell migration with an IC50 of 1 µM; (d) deletion of the gene encoding Y inhibits cell migration by more than 50%.

Each of these statements is a “testable unit”. These units could be grouped together at the end of the article in a “collection of discrete authenticated observations”, or CODA-O. Authors could also append a section listing the specific testable units from other groups’ work that they have validated in the course of their current work.

The main goal is for authors to articulate and “own” explicit, testable statements in which they express very high confidence. This, in turn, should prompt them to spell out the experimental conditions required, for example by indicating whether the results were obtained in one cancer cell line or in several. It would also clarify how the work might be extended (for example, by testing a result in other cell types).

Second, I propose that scientists extract these discrete observations from the literature and compile them into a registry that could seed experiments to be conducted in undergraduate or graduate laboratory courses.

My work focuses on biologically active lipids, their roles in cell signaling pathways and how these pathways go wrong in cancer. In addition to working with numerous postdocs and graduate students, I have trained and mentored approximately 150 undergraduate and high school students. Assigning a team of these trainees a replication request from the registry would be meaningful to them. Such a training program would produce researchers skilled at troubleshooting, as well as results that contribute to science. An approach with similar objectives has been implemented in the field of psychology (K. Button Nature 561, 287; 2018): graduate students develop a basic protocol and supervise groups of undergraduates who carry it out.

A registry of replication requests would provide multiple benefits. First, the requirement to specify which components of an experiment are expected to be reproducible would reduce the temptation to exaggerate and overgeneralize results. That could prevent some of the bickering that occurs when one group of researchers says it can’t duplicate the “work” of another. Second, researchers might be encouraged to describe their experiments more fully if others are more likely to attempt a formal replication. Third, trainees would learn to appreciate the experimental nature of the biological sciences (too often, undergraduates are taught that all that matters is the existing body of knowledge, not how it is constructed). Fourth, the registry would allow others to check whether the claims in a paper have been reproduced, and perhaps uncover subtle requirements for making an experiment work. Graduate students and interns working on updating the registry would gain practical insight into what makes work repeatable.

When too much data in the system is non-reproducible, the noise becomes overwhelming: “garbage in, garbage out”. However, with reliable evidence, synthesis and bioinformatics techniques will be much more productive and predictive.

This idea has limits. Some techniques are too specialized, and some experiments too expensive or too resource-intensive, to replicate. In these cases, a registry would still be useful, because it would articulate specific units of work and could prompt a targeted replication effort if the results are deemed critical to the community. It would also establish a priority – simply requiring that claims be recorded in a testable form would prompt researchers to spell out what should be reproducible, and under what conditions. That alone would save my lab and others from fruitless work.

Competing interests

The author declares no competing interests.
