Rewriting the Rules of Publication with Replication
We must show that science is about independent replication, not prestigious citation.
Look online and you’ll see countless researchers complaining about how research publication is broken, how journals don’t have our best interests at heart, and how so much progress is lost because we’re stuck with paywalls, peer review and PDFs.
Many hope for change, fewer have ideas for how, and even fewer have the skills to execute. We have one such idea at Scholar and are committed to executing on it. It’s not guaranteed to work, but it’s worth a hell of a shot. I’m going to try to explain why it’s a worthwhile attempt, and why DeSci is essential to its success.
The fundamental constraint holding back innovation in publishing is incentives.
We can build all the tools for reproducibility we want, we can create all the publishing platforms we want, and we can push for preprint adoption all we want, but unless we can incentivize researchers to pursue something other than prestige and citations, we’ll forever be constrained by these archaic measures of quality.
We believe that we can shift incentives by showing that science is about independent replication, not prestigious citation; by making independent replications a useful measure for allocating attention, grants and promotions. DeSci as a community is one of the few places with the boldness and zeal to embrace, integrate and drive this new measure and moral premise to success.
The Name of the Game
If you want to get involved in academic research today, you’re forced to play the game of prestige and citations. It’s the currency of science. You either get published in a prestigious outlet and accumulate citations, or you perish.
Prestige and citations are so widely used because they’re the best proxies for research quality we have. They’re how we measure contributions to science. But if we peek under the hood, we find that they’re far from perfect.
Citations originated in the pre-Internet world of physical journals and papers. Eugene Garfield pushed for their broad adoption as a proxy for research quality in the 1950s and 60s, after retrospectively correlating high citation counts with high-quality, high-impact research. It was a good measure at the time, but if we ran the same retrospective analysis today, we’d find that a lot of highly cited research isn’t actually replicable. Researchers started optimizing for citations, and it’s pushed them to neglect replicability and publish fake results.
Prestige is no better. It’s rooted in the brand names that journals and institutions have accumulated over decades. Scientists are stuck competing for placements at prestigious institutes with prestigious researchers, to get published in prestigious journals and get coverage on prestigious media outlets. It’s this proliferation of prestige that lets the same geriatric cohort of researchers get all the grant money. It’s preventing a growing community of internet-educated independents and internationals from participating in science, and many young researchers have just become fed up with the game and left academia altogether.
Prestige and citations have become gamed and are no longer aligned with good science.
Why then, are they so deeply embedded into our scientific ecosystem? Why are researchers stuck playing this game, despite all of these issues? Because...
Prestige and Citations are Useful
Despite everything that’s wrong with them, prestige and citations are still the best proxies for research quality we have.
There are so many scientists, and so much research published, that we need some way to differentiate the good from the bad, to find the signal in the noise. Prestige and citations help us allocate attention, grants and promotions. They’re useful for measuring the quality of the research, and the credibility of the researcher.
It’s important to understand what qualities make them good measures: they’re simple, scalable, easily applicable to any field, and intuitive to understand — even to a layman.
It’s these qualities that make them so useful to researchers, funders and institutions, and it’s this usefulness that ensures researchers are stuck publishing to prestigious journals to progress their careers. There’s no good alternative measure of quality to aim for.
Prestige and citations are also the fundamental reason why we haven’t been able to innovate in publishing. Any new journals we create are stuck playing the game of prestige, and it’s an uphill battle to take on the incumbents. Any new reproducible publishing platforms see muted adoption, because prestige takes precedence over reproducibility. Preprints have blossomed, but we still use prestigious publications for career progression.
As long as they’re used throughout the research ecosystem, prestige and citations will be the name of the game. We’ll forever be subject to their limitations.
So how do we move past them?
Build Something Better
To truly shift researchers away from prestige and citations, we should borrow from the playbook of Bitcoin and build something better, out on the technological frontier.
We need to build a better measure of research quality.
We must create a more useful alternative that researchers, funders and institutions can switch to; a better measure for allocating attention, grants and promotions. Importantly, this alternative must be better on the dimensions that matter. It must be simple, scalable, applicable to all research, and high enough in variance that it signals quality.
Independent replications can get us there.
We’ll explain this in detail, but before we dig in, we need to cover why it’s only recently become possible to measure replications effectively.
Truly Reproducible Research
The key innovation that’s opened the door for us to start measuring independent replications is reproducible research.
Reproducible research is the idea that the key results and figures of the paper should be generated from the code and data, ideally with the click of a button. It lets us ignore PDFs and focus on the real source of truth, the computation underlying the paper.
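To make this concrete, here’s a minimal sketch of what a one-command reproducible paper could look like. Every file name and column here is hypothetical; the point is the shape: one script that recomputes the headline statistics and regenerates a key figure straight from the raw data, so a reader reruns the analysis instead of trusting a static PDF.

```python
# reproduce.py -- hypothetical entry point for a reproducible paper.
# Running `python reproduce.py` recomputes the headline statistics and
# regenerates the paper's key figure directly from the raw data.

import csv
import statistics
from pathlib import Path

import matplotlib.pyplot as plt


def load_measurements(path: str) -> list[float]:
    """Read the raw measurements behind the paper's main result."""
    with open(path, newline="") as f:
        return [float(row["value"]) for row in csv.DictReader(f)]


def main() -> None:
    # Hypothetical data file shipped alongside the manuscript.
    values = load_measurements("data/measurements.csv")

    # The headline numbers reported in the paper, recomputed on demand.
    print(f"mean  = {statistics.mean(values):.3f}")
    print(f"stdev = {statistics.stdev(values):.3f}")

    # Figure 1, regenerated from the data rather than pasted in as an image.
    Path("figures").mkdir(exist_ok=True)
    plt.hist(values, bins=30)
    plt.xlabel("measured value")
    plt.ylabel("count")
    plt.savefig("figures/figure1.png", dpi=300)


if __name__ == "__main__":
    main()
```

With this structure, the figure in the published paper is a build artifact, not a claim.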
Code and data are increasingly becoming the true source of research results as science gets automated and computational research becomes the norm for most fields. This is especially true for fields done wholly on a computer, like Machine Learning, but the trend is clear.
Despite this proliferation of code, the publishing industry still treats PDFs as the primary source, and code and data as secondary — if they’re even linked at all.
We can flip that premise.
We can treat the code and data as primary and show that reproducible research is the most effective medium to share research.
Importantly, we can do this not only for new research, but for historical research too. This point will become important later on, but we can go back across our historical scientific record, take the most important research papers, and make it easy to digitally reproduce their results. With the tech stack of DeSci, we can also store this work on-chain so that it’s permanently open-access, owned by researchers, and outside of the control of any company.
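The on-chain details vary by protocol, so here’s only a sketch of the underlying primitive: content addressing. Hash the full research bundle (code, data, manuscript) and use the hash as its permanent identifier; the directory name below is hypothetical, and this uses nothing beyond the Python standard library.

```python
# Content addressing: the permanence primitive beneath on-chain publishing.
# A bundle's identifier is the hash of its bytes, so any edit to the code,
# data, or manuscript after publication yields a different identifier.

import hashlib
from pathlib import Path


def bundle_id(bundle_dir: str) -> str:
    """Hash every file in a research bundle into one stable identifier."""
    digest = hashlib.sha256()
    for path in sorted(Path(bundle_dir).rglob("*")):
        if path.is_file():
            # Include the relative path so renames change the hash too.
            digest.update(path.relative_to(bundle_dir).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


if __name__ == "__main__":
    # "paper_bundle" is a hypothetical directory of code, data and text.
    print(bundle_id("paper_bundle"))
```

Whatever chain or storage network sits underneath, anyone holding the bundle can recompute this hash and verify that nothing was altered after publication.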
We can create truly reproducible research.
So, now that we see the potential, why is independent replication so much better?
Independent Replication: The Gold Standard for Scientific Rigour
Independent replicability is the hallmark of good research. It’s the differentiating factor between the rigorous science of Newton’s Laws and Maxwell’s Equations, and the weak science of all the non-replicable publications polluting our scientific record.
It was never possible to effectively measure the number of replications, until now.
With reproducible research, scientists can share their code and data so that anyone with a computer and an internet connection can reproduce their results. And since it’s all done digitally, it’s straightforward to track how many people independently reproduce them.
We can track how many times research has been replicated,[1] and—going back to our previous point—we can also do this for our historical scientific record. We can display replications for historical work, and in doing so, expose all of the bogus, prestigious but non-replicable work that's polluting our published literature, and elevate the good-quality, truly replicable research.
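As a sketch of what that tracking could look like under the hood (the record shape and names below are ours, not any existing protocol’s), here is a hypothetical replication ledger that counts only successful reruns by people other than the original authors. Deduplicating by replicator is also a crude first cut at the Sybil concern raised in the footnotes.

```python
# A hypothetical replication ledger: each record says who reran which
# bundle and whether the recomputed results matched the published ones.

from dataclasses import dataclass


@dataclass(frozen=True)
class Replication:
    bundle_id: str       # content hash of the paper's code + data bundle
    replicator: str      # identity of whoever reran it
    results_match: bool  # did the rerun reproduce the published results?


def independent_replications(records: list[Replication],
                             bundle: str,
                             authors: set[str]) -> int:
    """Count successful replications by people other than the authors.

    Collecting replicators into a set means one identity can only ever
    contribute a single replication per paper -- a first, crude Sybil
    defence against spamming replications of your own work.
    """
    replicators = {
        r.replicator
        for r in records
        if r.bundle_id == bundle
        and r.results_match
        and r.replicator not in authors
    }
    return len(replicators)


ledger = [
    Replication("abc123", "alice", True),
    Replication("abc123", "bob", True),
    Replication("abc123", "alice", True),        # duplicate: counted once
    Replication("abc123", "original_pi", True),  # author: excluded
]
print(independent_replications(ledger, "abc123", {"original_pi"}))  # -> 2
```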
We can show that science is about independent replication, not prestigious citation.
So if we can build and bootstrap it, why is independent replication a better measure?[2]
— It’s a necessary corrective force to the replication crisis.
— It’s a better proxy for quality because if people can independently validate and verify results, then that research probably has good grounding in truth.
— It’s a simple number, like citations.
— It’s scalable because almost all research is becoming automated and computational, so we can push for reproducible research to become the norm for all fields.
— It’s intuitive to understand—even to a layman—that good science means that anyone can independently verify something to be true.
— It’s a high variance signal for quality because more impactful research will be replicated more, since there are greater incentives to verify and critique important work.
Imagine if we optimized for replications instead of citations.
Imagine if we used replications instead of citations for allocating attention, grants and promotions. Imagine if researchers were incentivized to make their work as reproducible as possible, rather than hiding their code and data from critique. How different would science look today?
That’s the vision.
We’ll build towards a future where all research is reproducible, and where the quality of the research and the credibility of the researcher are measured by replicability. We’ll get there by bootstrapping replications as a useful metric and pushing for their adoption across the ecosystem.
In order to make it happen, we need the right moral and technological firepower behind it. DeSci has what it takes.
DeSci Can Make This Vision A Reality
Most see DeSci as a technological movement, but the ideological and moral element is just as—if not more—important.
DeSci is for brokenists, not institutionalists. We believe in building anew, out on the frontier, free from legacy constraints. We’re practical enough to realize that radical change rarely comes from reform, we’re visionary enough to believe in a vastly better future, and we’re bold enough to try to make that vision a reality.
It’s this narrative of building anew that’s essential for us to succeed. We must believe that we can bootstrap a new ecosystem on a new moral premise from scratch, and ultimately create something better than the old.
DeSci has the zeal to reject the old ways of doing things and be the early adopters of this new measure of science. We have the projects, protocols and funders that can embrace and rally behind this premise. It’s the same moral premise that Balaji Srinivasan has been advocating we embed into our identity and take to its logical limit, the premise that:
Science is about independent replication, not prestigious citation.
Footnotes

[1] Reproduction is not replication, but as robotics and automation proliferate across all research disciplines, most research will probably be done wholly on a computer, and so the code will be able to replicate increasingly large portions of the research. There could also be different tiers of replication; e.g. pay X to have research reproduced digitally on servers, or pay Y to have automated pipetting robots replicate the experiment, collect data, and then reproduce the results on servers. There will have to be some Sybil resistance mechanisms to prevent a researcher from spamming replications of their own work, although this is a tractable problem.
[2] Obviously, independent replications are not perfect, but let’s not kid ourselves into thinking that citations are perfect either. It’s rare that we ever reach the platonic ideal, and independent replication has a good shot at being a better measure and a necessary corrective force to the current state of scientific research.