There is no replication crisis in science. This is the base rate fallacy.
Over the past decade and a half, several reports have drawn attention to a growing crisis in science: many scientific studies, especially in the life sciences, cannot be replicated. For example, a major reproducibility project that sought to reproduce 193 experiments from 53 leading research papers on cancer biology encountered multiple obstacles. In the end, it was only able to repeat 50 experiments from 23 papers.
Naturally, the replication crisis challenges the foundation of knowledge generation. If we cannot reproduce the results of experiments, how much confidence should scientists or the public have in the majority of scientific research?
It is possible, however, that the replication crisis is non-existent or greatly exaggerated. Here is an overview of two lines of argument.
Base Rate Error
When COVID cases surged in the wake of vaccination efforts, many people on the internet suggested that COVID vaccines were ineffective. One of their main arguments was that more cases were being seen in vaccinated people than in unvaccinated people.
This is a classic example of the base rate error (also called the base rate fallacy), which is the tendency to ignore the general prevalence (the base rate) of a phenomenon and instead focus on data relating only to a specific group or situation. If most people are vaccinated, even a small fraction of that large group can amount to as many people as, or more people than, a much larger fraction of the unvaccinated minority.
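To make the arithmetic concrete, here is a minimal Python sketch. The numbers are made up for illustration and are not real COVID data: 900,000 vaccinated and 100,000 unvaccinated people, a 10% infection risk for the unvaccinated, and a vaccine assumed to cut that risk by 80%. The function name and its parameters are hypothetical.

```python
def case_counts(vaccinated, unvaccinated, risk_unvaccinated, vaccine_efficacy):
    """Return expected case counts and per-person risks for both groups.

    All inputs are illustrative assumptions, not measured values.
    """
    # The vaccine is assumed to scale down the baseline risk.
    risk_vaccinated = risk_unvaccinated * (1 - vaccine_efficacy)
    return {
        "vaccinated_cases": vaccinated * risk_vaccinated,
        "unvaccinated_cases": unvaccinated * risk_unvaccinated,
        "risk_vaccinated": risk_vaccinated,
        "risk_unvaccinated": risk_unvaccinated,
    }


result = case_counts(
    vaccinated=900_000,
    unvaccinated=100_000,
    risk_unvaccinated=0.10,
    vaccine_efficacy=0.80,
)
print(f"Cases among vaccinated:   {result['vaccinated_cases']:,.0f}")
print(f"Cases among unvaccinated: {result['unvaccinated_cases']:,.0f}")
print(f"Risk if vaccinated:       {result['risk_vaccinated']:.1%}")
print(f"Risk if unvaccinated:     {result['risk_unvaccinated']:.1%}")
```

Under these assumptions, roughly 18,000 cases occur among the vaccinated against 10,000 among the unvaccinated, even though each vaccinated person's risk is five times lower. The raw counts mislead because the groups differ so much in size.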
Why is this important for replication in science? British philosopher Alexander Bird argues that the base rate fallacy explains the low replication rates.
When studying a phenomenon, it is not surprising if most hypotheses turn out to be wrong. This may be due to a number of reasons, including a tendency to test bold ideas or because the particular area of study is largely unexplored and challenging (such as cancer biology). A high prior probability that hypotheses are false is therefore consistent with high-quality science. Due to the inherent variability of experiments, some of these false hypotheses will appear to be confirmed.
If false hypotheses make up a large share of all hypotheses tested, the number of these false positives can be comparable to the number of hypotheses that are (correctly) confirmed. The picture is further skewed by the fact that scientists are more likely to report studies in which they find something, whether or not it is really true, than studies in which their hypotheses fail. Subsequent experiments that attempt to reproduce the false positives will fail.
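The argument can be illustrated with a standard textbook calculation. This is a sketch under stated assumptions, not necessarily Bird's own formulation. It assumes a hypothetical prior (the share of tested hypotheses that are true), a statistical power of 0.8, a significance threshold of 0.05, and a replication that uses the same power and threshold as the original study. "Replicates" here simply means the replication also reaches significance.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of positive (significant) results that reflect true hypotheses."""
    true_positives = prior * power          # true hypotheses that are detected
    false_positives = (1 - prior) * alpha   # false hypotheses that pass by chance
    return true_positives / (true_positives + false_positives)


def expected_replication_rate(prior, power, alpha):
    """Chance that a positive result is positive again in a same-design replication."""
    ppv = positive_predictive_value(prior, power, alpha)
    # True findings replicate with probability `power`,
    # false positives "replicate" only by chance, with probability `alpha`.
    return ppv * power + (1 - ppv) * alpha


# Illustrative priors only; real fields are not known to have these values.
for prior in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(prior, power=0.8, alpha=0.05)
    rate = expected_replication_rate(prior, power=0.8, alpha=0.05)
    print(f"prior={prior:<5} PPV={ppv:.2f} expected replication rate={rate:.2f}")
```

With a prior of 0.1, only about 64% of positive results correspond to true hypotheses, and only about 53% would be expected to replicate, without any misconduct in the model. Lower priors push both figures down further.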
Among scientific fields, psychology and clinical medicine are often reported to have the lowest reproducibility rates. Unlike in physics, our knowledge of biology is still far too incomplete to understand biological systems from first principles. Hypotheses are therefore more likely to be wrong, which, on this view, explains the low reproducibility of experiments.
Some scientists suggest that the replication crisis is a symptom of systemic challenges. One of the main reasons why studies are not reproducible, they say, is the pressure to publish or perish that researchers often face. Are an alarming number of scientists resorting to unethical practices or shortcuts to achieve hard-to-reproduce publishable results?
In a preprint published on SocArXiv, researchers claim that low reproducibility in a field can exist even if no scientist engages in data tampering or other questionable practices.
The authors, however, disagree with the base rate fallacy argument. Flipping the argument around, they note that the average newly hired assistant professor of psychology has 16 publications. These publications test several hypotheses, and few of them report negative results (because journals and funding agencies reward positive results). If most hypotheses were unlikely to be true, producing such a set of positive results would be out of reach for most young academics.
To prevent researchers from formulating hypotheses after the results are known (HARKing), clinical and psychological studies can be preregistered. This means that their hypotheses and methods are documented before the studies are carried out. Comparing preregistered studies with published studies suggests that many hypotheses are indeed wrong. However, the gap between the numbers of false and true hypotheses (among those tested) is not large enough for the base rate fallacy to be a sufficient explanation for the replication crisis.
Using low replication rates to assert that a field produces a large number of incorrect results requires the assumption that effect sizes are fixed. In an experiment that indicates how two parameters are related, the size of the effect indicates the strength of the relationship. However, depending on the context of the experiment, the size of the effect can vary considerably.
The authors constructed a statistical model of publication and replication that incorporates variation in effect sizes. Their simulations showed replication rates as low as 50% without incorporating any unethical behavior.
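The general idea can be illustrated with a small Monte Carlo sketch. This is not the authors' model, and it is not meant to reproduce their figures. It is a simplified stand-in with hypothetical parameters: every hypothesis is true with the same average effect (`mean_effect`), the realized effect shifts from one context to the next (`context_sd`), each study measures it with sampling noise (`standard_error`), and only significant originals are "published".

```python
import random


def simulate_replication_rate(mean_effect, context_sd, standard_error,
                              n_studies=20_000, z_crit=1.96, seed=0):
    """Estimate how often a published (significant) result replicates.

    Hypothetical toy model: no false hypotheses and no misconduct,
    only context-dependent effects plus sampling noise.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    published = 0
    replicated = 0
    for _ in range(n_studies):
        # Original study: effect in its own context, measured with noise.
        effect_original = rng.gauss(mean_effect, context_sd)
        estimate_original = rng.gauss(effect_original, standard_error)
        if estimate_original / standard_error <= z_crit:
            continue  # not significant, so never published
        published += 1

        # Replication: a new context and an independent measurement.
        effect_replication = rng.gauss(mean_effect, context_sd)
        estimate_replication = rng.gauss(effect_replication, standard_error)
        if estimate_replication / standard_error > z_crit:
            replicated += 1
    if published == 0:
        raise ValueError("no study reached significance; adjust the parameters")
    return replicated / published


fixed = simulate_replication_rate(mean_effect=0.4, context_sd=0.0, standard_error=0.15)
varying = simulate_replication_rate(mean_effect=0.4, context_sd=0.3, standard_error=0.15)
print(f"Replication rate, fixed effect size:    {fixed:.2f}")
print(f"Replication rate, varying effect sizes: {varying:.2f}")
```

In this toy setup every tested hypothesis is true, yet the replication rate stays well below 100% and drops further once effect sizes are allowed to vary between contexts. The specific values depend entirely on the assumed parameters.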
Rethinking the replication crisis
It is often claimed that the replication crisis is having a negative impact on the already weakened public perception of science. However, the lack of reproducibility is an inherent characteristic of scientific fields exploring bold ideas.
If a hypothesis is very unlikely to be true, even a positive result means that it is still unlikely to be true. Results overruled by later experiments highlight the self-correcting nature of science.
Those concerned about the replication crisis have offered a range of potential solutions. However, if the crisis is merely a statistical byproduct of the scale of modern science, these solutions may have unintended consequences. For example, lowering the significance threshold, which some scientists believe can help, will hurt productivity without improving the replication rate.
The authors also suggest that “some reforms may impose disproportionate costs on early-career researchers, especially those whose identities are underrepresented in science.”