Systematizing Confidence in Open Research and Evidence (SCORE)

Authors : Nazanin Alipourfard, Beatrix Arendt, Daniel M. Benjamin, Noam Benkler, Michael Bishop, Mark Burstein, Martin Bush, James Caverlee, Yiling Chen, Chae Clark, Anna Dreber Almenberg, Tim Errington, Fiona Fidler, Nicholas Fox, Aaron Frank, Hannah Fraser, Scott Friedman, Ben Gelman, James Gentile, C Lee Giles, Michael B Gordon, Reed Gordon-Sarney, Christopher Griffin, Timothy Gulden et al.,

Assessing the credibility of research claims is a central, continuous, and laborious part of the scientific process. Credibility assessment strategies range from expert judgment to aggregating existing evidence to systematic replication efforts.

Such assessments can require substantial time and effort. Research progress could be accelerated if there were rapid, scalable, accurate credibility indicators to guide attention and resource allocation for further assessment.

The SCORE program is creating and validating algorithms to provide confidence scores for research claims at scale. To investigate the viability of scalable tools, teams are creating: a database of claims from papers in the social and behavioral sciences; expert and machine generated estimates of credibility; and, evidence of reproducibility, robustness, and replicability to validate the estimates.

Beyond the primary research objective, the data and artifacts generated from this program will be openly shared and provide an unprecedented opportunity to examine research credibility and evidence.

URL : Systematizing Confidence in Open Research and Evidence (SCORE)


The Natural Selection of Bad Science

Authors : Paul E. Smaldino, Richard McElreath

Poor research design and data analysis encourage false-positive findings. Such poor methods persist despite perennial calls for improvement, suggesting that they result from something more than just misunderstanding.

The persistence of poor methods results partly from incentives that favor them, leading to the natural selection of bad science. This dynamic requires no conscious strategizing—no deliberate cheating nor loafing—by scientists, only that publication is a principle factor for career advancement.

Some normative methods of analysis have almost certainly been selected to further publication instead of discovery. In order to improve the culture of science, a shift must be made away from correcting misunderstandings and towards rewarding understanding. We support this argument with empirical evidence and computational modeling.

We first present a 60-year meta-analysis of statistical power in the behavioral sciences and show that power has not improved despite repeated demonstrations of the necessity of increasing power.

To demonstrate the logical consequences of structural incentives, we then present a dynamic model of scientific communities in which competing laboratories investigate novel or previously published hypotheses using culturally transmitted research methods.

As in the real world, successful labs produce more “progeny”, such that their methods are more often copied and their students are more likely to start labs of their own.

Selection for high output leads to poorer methods and increasingly high false discovery rates. We additionally show that replication slows but does not stop the process of methodological deterioration. Improving the quality of research requires change at the institutional level.

URL : The Natural Selection of Bad Science