Going for robustness: science


This is a follow-up to my earlier post entitled "Going for robustness", focusing on scientific research.

What is "robust science"? I see at least two interpretations, and I am going to discuss both of them: robustness of scientific findings, and robustness of the process of doing science, which includes in particular the robustness of the web of scientific research institutions: first and foremost universities and research labs, but also learned societies, funding agencies, publishers, etc.

Robust knowledge

"Reliable Knowledge: an exploration of the grounds for belief in science" is a book by physicist-turned-philosopher-of-science John Ziman. It's a good book, which I recommend every scientist to read. Its title is a very good definition of the central goal of science: obtaining knowledge that is reliable. Or robust, which means the same in this context. Reliable or robust knowledge is knowledge that has been subjected to various quality control processes: verification, empirical tests, cross-checks, etc. Such knowledge can be expected to remain valid even if one of its underpinnings is later discovered to be shaky. And that's a highly desirable quality for any knowledge that important decisions are based on.

The quality control processes in science require robust technology for making observations, and for communicating and preserving these observations. Plus robust technology for creating, updating, and applying the scientific models we make to summarize and explain observations. Perhaps less obviously, quality control also requires a shared understanding of all this technology within research communities, in order to ensure that scientists can judge the reliability of a scientific instrument, the domain of applicability of a computational method, the limits of validity of a scientific model, and a myriad of other characteristics of both the subjects and the research techniques of any given domain of scientific enquiry. In particular, such a shared understanding is the basis for a sound judgement of the degree of robustness of scientific findings.

The replicability crisis has demonstrated that the quality control processes of science have started to fail. It is often interpreted as a sign of diminishing quality of published work, due to negligence, incompetence, or outright fraud. While all of these certainly happen, it is important not to overlook the most important aspect of the crisis: an overestimation of how replicable published scientific work can be expected to be. It is the unexpected irreplicability of many results that turned its discovery into a crisis. After all, while the ultimate goal of science is robust knowledge, this does not mean that each individual published result must be robust. Much of the quality control happens after publication, on much longer time scales, based on the confrontation of many different findings with overlapping applicability. In my view, the main lesson from the replicability crisis is that many scientific disciplines lack a sufficient shared understanding of their techniques. What supports this view is the notable absence of the oldest and most mature domains of research, experimental and theoretical physics and chemistry, from the replicability crisis. There are of course non-replicable results in these domains, but they are dealt with routinely and hardly ever make the news. Younger disciplines lack this level of maturity, as do the recent computational branches of physics and chemistry.

One of the underappreciated challenges to constructing shared understanding is the use of computers. Like democracy (see my last post), science has been steamrolled by information technology, which has developed so fast that the quality control mechanisms have not been able to adapt. Computers and software have given us shiny new tools that are relatively easy to apply but very hard to understand. Whereas sophisticated statistical inference methods once required collaborating with a trained statistician, in addition to access to very expensive computers, they are now menu entries in software that runs on everybody's desktop. Should we then be surprised that many of the horror stories of the replicability crisis involve sophisticated statistical methods?

Likewise, the shift in the 1990s from hardware and software custom-made for science to commodity technology primarily designed for commercial applications has seriously reduced scientists' understanding of and agency over computational methods and tools. For an in-depth discussion, see my talk at SPLASH'24 and the paper that goes with it. With the technology of the tech industry, science has also adopted beliefs and attitudes from the tech industry that are in blatant opposition to the principles of science. We have accepted as normal that models and methods implemented in software are opaque and exempted from scientific quality control. We have also accepted as normal that a small number of software developers can impose their ideas of methodological progress on a majority of computationally illiterate users through compulsory software updates.

In order to make science robust again, one important goal to aim for is digital sovereignty for science. This doesn't mean that we shouldn't adopt commodity technologies, but we must do so critically, and tweak them to our needs. That is something that we can all start to do individually, at a small scale, if we are willing to give up some productivity in exchange. Critically examining technology takes time and effort, as does tweaking or replacing it. If you choose this path, you will probably produce fewer papers. It's therefore not a realistic option for early-career researchers, but established scientists can make this choice. As I said in my last post, the most important step is to start doing something, no matter how small. Replace a piece of proprietary software with an Open Source alternative. Get more familiar with the Open Source software you use. Establish contact with its developers, if only to tell them what support you would need from them in order to become a more responsible user. Consider the technological choices made by the developers, and in particular whether they respect the needs of science: transparency and reviewability. Next, learn enough about your software that you can judge its robustness (more on that in the next episode in this blog post series). And, more generally, do whatever it takes to improve your computational literacy.

However, individual action will only get us so far. Science is fundamentally a collective process, in which we stand on the shoulders of giants to see further, and look over each other's shoulders to check for mistakes and biases. Information technology has steamrolled these processes as well. It has enabled larger and more complex research projects, gathering ever larger and more diverse teams. That is progress in a way, because it allows us to tackle more difficult questions. However, the quality control processes of the 1950s are not adapted to such projects. A paper written by an interdisciplinary team of 15 scientists is still sent for review by two or three individuals doing their examination in isolation. They have neither the competence nor the time required to perform a thorough review.

To make science robust again, we need to update our quality control processes. As a rule of thumb, reviewing a paper requires the same mix of competences as doing the work described in it. Reviewing interdisciplinary work requires an interdisciplinary team. Yes, that takes more time and effort. It will reduce our productivity. The same can be said for reviewing software and machine-learning techniques, a task for which we first have to develop appropriate processes. I have described a few possible steps in this preprint. All of them will be easier to implement if more researchers are more familiar with the medium of software, so the individual actions outlined above matter here as well.

An important obstacle to digital sovereignty for science is the near-complete lack of interest in techniques and infrastructure for the digital era demonstrated by today's scientific institutions. It parallels the lack of interest in the digital transformation shown by governments, and that is probably not a coincidence: most scientific research today is organized and funded by institutions that directly depend on government funding. Governments in the Western world decided long ago to leave the digital sphere to "the market", meaning in practice a handful of corporations. We shouldn't expect them to make a different decision for the scientific institutions that they oversee. Which leads me to the second aspect of robust science: the robustness of our institutions and the research processes managed by them.

Robust processes

Recent events in the USA have illustrated how dependent today's scientific research is on government decisions. Even though research funding is rather diversified in the USA, compared to more centralized countries such as France, very few institutions can continue business as usual if the federal government decides to withdraw its funding.

This critical dependence on state funding for research is a rather recent phenomenon. In its early days, in the 16th century, science was much like art in that its practitioners had to be wealthy themselves or have wealthy sponsors. In exchange, they had a lot of freedom in their work, being bound only by the rules that were formulated by the emerging scientific community. As science progressed on its own growth path, it attracted the attention, and money, of more and more people interested in what we now call applied science, i.e. research done with the goal of enabling change in the world. Science, capitalism, and industry are in fact closely intertwined, and are all core processes of the era that sociologists call modernity.

Science got its growth boost after World War II, when governments started to adopt economic growth as a goal they should actively support in order to make their countries more competitive on the international markets. They began investing heavily in scientific research, both fundamental and applied. The growth of science has thus been intimately related to economic growth for more than half a century. Just like the recent adoption of commercial technology led to a tacit acceptance of the social values of the tech industry, the motivation of doing science for industrial growth in the 20th century led to a tacit acceptance of the values of industry, in particular the values of efficiency and productivity.

Then quantitative management took over industry and science. Just like economic growth was quantified as growth of GDP, scientific productivity was quantified via bibliometry, with similarly perverse consequences. The pressure for productivity and impact has been steadily increasing over recent years. The recent drastic funding cuts in the USA can be seen as a move to weaken the political opposition, but they also make sense economically: why fund research if you have already decided to ignore its outcomes? In France, we see similar plans to cut down public research and keep only "the best" (see this article for a summary in English), even though the political discourse justifying these plans is very different in both style and arguments. In light of these recent developments, it becomes clear that the scientific institutions that welcomed the massive post-WWII state investments in research signed a deal with the devil, in that they became very dependent on government politics. That's as fragile as it can get.

On the other hand, the question of which problems society should allocate resources to for scientific investigation is undeniably a political one. One way to make scientific institutions more robust while orienting their work towards questions of societal interest is to anchor science more firmly in the public sphere. Imagine, for example, universities adding "consulting and counseling" to their missions, along with the traditional missions of teaching, scholarship, and research. Actors from the public sphere, such as associations, municipal councils, or even school classes, could book an appointment to discuss the scientific investigation of a question they care about. Importantly, this is not research as a service. It's the people who care about the question who would do much, maybe most, of the research work, with professional scientists accompanying and advising them. This isn't actually a revolutionary idea; it's already practiced at very small scales in some citizen science projects. The major novelty in my proposal is to make this an official and very visible part of a university's missions. In the long run, it would increase awareness of and knowledge about science in the population, but also awareness of and knowledge about societal issues among professional scientists. Funding would still come mostly from public money, but not necessarily from the most centralized level, i.e. national governments. Setting this up is above the means of an individual, but a single university could take such a step towards robustness. Robustness of its own operation in the short term, and robustness of science as a social process in the long run.

Anchoring science more firmly in the population and in all our institutions is not only a path to more robust science, but also to a more robust society. Looking beyond the technicalities of scientific research, the foundation of science is the belief in a shared reality to which nobody has full access, but which we can understand as a collective by maintaining epistemic humility and checking each other's work and affirmations. That's a good antidote to the fake news and tribalist rhetorical warfare that authoritarian regimes thrive on. I believe that it is also a better basis for deliberating on political issues than what we do today in representative democracies: delegating all decisions to a small elite that excels in radiating certainty on topics that they understand at best superficially. And maybe, just maybe, this could also be a step towards dealing with existential societal problems such as the transgression of planetary boundaries.

DOI: 10.59350/x6ny8-vva79

