The plea for stability in the SciPy ecosystem that I posted last week on this blog has generated a lot of feedback, both as comments and in a lengthy Twitter thread. For the benefit of people discovering it late, here is a summary of the main arguments and my reply to them.
Two NumPy-related news items appeared on my Twitter feed yesterday, just a few days after I had accidentally started a somewhat heated debate myself concerning the poor reproducibility of Python-based computer-aided research. The first was the announcement of a plan for dropping support for Python 2. The second was a pointer to a recent presentation by Nathaniel Smith entitled “Inside NumPy” and dealing mainly with the NumPy team’s plans for the near future. Lots of material to think about… and comment on.
It’s hard to find an aspect of modern life that is not influenced in some way by software. Some of it is very visible, for example the Web browser I start on my computer. Other software is completely invisible, such as the software controlling my car’s diesel engine. Some software is safety critical, for example flight control software in airplanes. Other software is used in a much more futile way, such as playing games. I could go on listing characteristics in which different software packages differ, but I will leave it at that - I don’t really expect anyone to disagree about the ubiquity and diversity of software in our increasingly digital world.
A few days ago, I noticed this tweet in my timeline:
I 'still' program in C. Why? Hint: it's not about performance. I wrote an essay to elaborate... appearing at Onward! https://t.co/pzxjfvUs5B— Stephen Kell (@stephenrkell) September 5, 2017
That sounded like a good read for the weekend, which it was. The main argument the author makes is that C remains unsurpassed as a system integration language, because it permits interfacing with “alien” code, i.e. code written independently and perhaps even in different languages, down to assembly. In fact, C is one of the few programming languages that lets you deal with whatever data at the byte level. Most more “modern” languages prohibit such interfacing in the name of safety - the only memory you can access is memory allocated through your safe language’s runtime system. As a consequence, you are stuck in the closed universe of your language.
Over the last few years, I have repeated a little experiment: Have two scientists, or two teams of scientists, write code for the same task, described in plain English as it would appear in a paper, and then compare the results produced by the two programs. Each person/team was asked to do a maximum amount of verification and testing before comparing to the other person’s/team’s work.
A few years ago, I decided to adopt the practices of reproducible research as far as possible within the technical and social constraints I have to live with. So how reproducible is my published code over time?
In discussions about computational reproducibility (or replicability, or repeatability, according to the preference of each author), I often see the argument that reproducing computations may not be worth the investment in terms of human effort and computational resources. I think this argument misses the point of computational reproducibility.
Two currently much discussed issues in scientific computing are the sustainability of research software and the reproducibility of computer-aided research. I believe that the communities behind these two ideals should work together on taming their common enemy: software collapse. As a starting point, I propose an analysis of how the risk of collapse affects sustainability and reproducibility.
The importance of reproducibility in computer-aided research (and elsewhere) is by now widely recognized in the scientific community. Of course, a lot of work remains to be done before reproducibility can be considered the default. Doing computational research reproducibly must become easier, which requires in particular better support in computational tools. Incentives for working and publishing reproducibly must also be improved. But I believe that the Reproducible Research movement has made enough progress that it’s worth considering the next step towards doing trustworthy research with the help of computers: verifiable research.
Think of all the things you hate about using computers in doing research. Software installation. Getting your colleagues’ scripts to work on your machine. System updates that break your computational code. The multitude of file formats and the eternal need for conversion. That great library that’s unfortunately written in the wrong language for you. Dependency and provenance tracking. Irreproducible computations. They all have something in common: they are consequences of the difficulty of composing digital information. In the following, I will explain the root causes of these problem. That won’t make them go away, but understanding the issues will perhaps help you to deal with them more efficiently, and to avoid them as much as possible in the future.