Adapted from some old writing of mine.
Combining science with standardization: what could be more boring? Why squeeze the last bit of creativity out of an already-rigid institution?
Research involves doing something that’s never been done before, and that makes science hard to codify. But in fields amenable to standardization, common protocols can dramatically accelerate the pace of discovery.
The value of standards
Standardization can address several problems in science such as:
Fraud. P-hacking and manipulation are harder when using agreed-upon methods because there are fewer degrees of freedom. Standards also make it easy to preregister experiments and harder to wiggle out of their requirements.
Replication. Replicating results is easier with a fully specified protocol.
Tacit knowledge. Science is plagued by the problem of tacit knowledge; standards make tacit knowledge public.[1]
Researcher training. It’s easier to train new researchers to follow a set of instructions, which reduces the need for highly skilled researchers to perform experiments. Lower costs per experiment should lead to more research being done.
Comparing results. Standardization also ensures that results are easily compared across publications. Subtle differences in technique can distort results and make it hard to draw general conclusions. By reporting results based on common standards, researchers can easily review an entire literature by examining a few key metrics.[2]
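To make the last point concrete, here is a minimal sketch of what reviewing a literature by a few key metrics could look like. Everything in it is an assumption for illustration: the paper labels, the protocol versions, and the metric values are made up, and `summarize` is a hypothetical helper rather than part of any real tool.

```python
from statistics import mean, stdev

# Hypothetical reports: each paper states the same agreed-upon metric
# together with the protocol version it was measured under.
# All values below are invented for illustration.
reports = [
    {"paper": "A", "protocol": "v1", "metric": 12.0},
    {"paper": "B", "protocol": "v1", "metric": 14.0},
    {"paper": "C", "protocol": "v1", "metric": 13.0},
    {"paper": "D", "protocol": "v2", "metric": 20.0},
]


def summarize(reports, protocol):
    """Summarize only the results measured under the given protocol version."""
    values = [r["metric"] for r in reports if r["protocol"] == protocol]
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"n": len(values), "mean": mean(values), "spread": spread}


summary = summarize(reports, "v1")
print(summary)
```

The filter on the protocol version is the important part: paper D is left out because its number was obtained under a different protocol, so it is not directly comparable with the other three.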
Frictionless Reproducibility for experiments
These benefits combine to accelerate the rate of innovation. A larger volume of research and easy comparison of results can make science more amenable to data analysis. Machine learning algorithms can be used to make new discoveries or suggest new research ideas.
Standards also pave the way for efficient and automated research. Rigid techniques are more amenable to automation: if a standardized experiment becomes the industry standard, equipment can be designed to run it automatically. This further accelerates the research process.
Taken together, robotic experiments can bring Frictionless Reproducibility to the physical sciences. The flywheel of open data, easy reproducibility, and competitive challenges can create a scientific singularity in many domains.
The plan
What do I mean by standardization? Making this precise is a topic for another time, but the basic idea is that each field should have several “figures of merit” with agreed-upon definitions and protocols. There should be thorough, unambiguous documentation on how to perform the experiments required to obtain these values.
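One way to picture a figure of merit with an agreed-upon definition and protocol is as a small record that every reported result points back to. The sketch below is only an illustration of that idea, not a proposal for a real format: the class names, the fields, and the placeholder metric are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FigureOfMerit:
    """A hypothetical record for one agreed-upon metric in a field."""
    name: str
    unit: str
    definition: str
    protocol_version: str
    protocol_steps: tuple[str, ...]


@dataclass(frozen=True)
class ReportedResult:
    """A value reported in a paper, tied to the protocol version used."""
    merit: FigureOfMerit
    value: float
    protocol_version: str

    def follows_current_protocol(self) -> bool:
        # A result is directly comparable only if it was measured
        # under the currently agreed-upon protocol version.
        return self.protocol_version == self.merit.protocol_version


# Placeholder metric; the name, unit, and steps are invented.
example_merit = FigureOfMerit(
    name="example_metric",
    unit="arbitrary units",
    definition="Placeholder definition agreed on by the field.",
    protocol_version="1.0",
    protocol_steps=(
        "Prepare the sample as documented.",
        "Run the measurement.",
        "Report the value with its uncertainty.",
    ),
)

current = ReportedResult(example_merit, 13.0, "1.0")
outdated = ReportedResult(example_merit, 20.0, "0.9")
print(current.follows_current_protocol(), outdated.follows_current_protocol())
```

Keeping the protocol version on both the definition and the result is a deliberate choice in this sketch: it lets definitions be revised over time while still making clear which results were measured the same way.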
The trickiest part will be building consensus within each field. The deliberation process should be representative of a diverse set of researchers; it’s important to avoid setting the bar for quality too high, since that would put well-funded researchers at an advantage.
To encourage researchers to abide by these rules, funding organizations could require them to report these metrics in at least some of their papers as a condition for future funding.
It may be difficult for all researchers to establish routines that are up to code. To remedy this, the government should fund specialized laboratories that accept samples from researchers and perform standardized tests.
Conclusion
Standardization is a natural process in science. Fields eventually come to a consensus about what’s important and which practices to adopt. However, there’s value in making this process more explicit.
Governments and nonprofits can induce standardization by facilitating conversations between scientists, publishing sound methodologies, and making standardized results a condition for funding.
Common protocols can bring significant benefits to specific fields of science. The first effort should center on important, established research fields with a “figure of merit” amenable to standardization.
Of course, too much emphasis on a single metric can lead to problems. To avoid Goodharting, it would be best to incorporate slack into the system, update methods and definitions regularly, and choose them to ensure a level playing field.
[1] Incidentally, this should make it easier for scientists to share their work. No need to write a detailed experimental protocol if it’s already standard.
[2] Standard experiments can also provide a “unit of productivity” by which to measure scientific productivity. This should lead to more funding going to effective scientists. It also reduces the “file-drawer” problem, since researchers are incentivized to show the experiments that didn’t work in order to keep productivity numbers up.
It would be great if that happened in theoretical physics too! It is frustrating to see the current confusion about alternative theories to general relativity.