An open letter to software engineers criticizing Neil Ferguson's epidemics simulation code

scientific software

Dear software engineers,

Many of you were horrified at the sight of the C++ code that Neil Ferguson and his team wrote to simulate the spread of epidemics. I sympathize. The only reason I am less horrified than you is that I have seen a lot of similar-looking code before. It is in fact quite common in scientific computing, in particular in research projects that have been running for many years. But like you, I don't have much trust in that code being a faithful and trustworthy implementation of the epidemiological models that it is supposed to implement, and I don't want to defend bad code in science.

However, many of your specific criticisms show a lack of familiarity with today's academic research. This code is not the sole result of 13 years of tax-payer-funded research. The core of that research is building and applying the model implemented by the code; the code itself is merely a means to this end. The scientists who wrote this horrible code most probably had no training in software engineering, and no funding to hire software engineers. And the senior or former scientists who decided to give tax-payer money to this research group are probably even more ignorant of the importance of code for science. Otherwise they would surely have allocated money for software development, and verified the application of best practices.

But the main message of this letter is something different: it's about your role in this story. That's of course a collective you, not you the individual reading this letter. It's you, the software engineering community, that is responsible for tools like C++ that look as if they were designed for shooting yourself in the foot. It's also you, the software engineering community, that has made no effort to warn the non-expert public of the dangers of these tools. Sure, you have been discussing these dangers internally, even a lot. But to outsiders, such as computational scientists looking for implementation tools for their models, these discussions are hard to find and hard to understand. There are lots of tutorials teaching C++ to novices, but I have yet to see a single one that starts with a clear warning about the dangers. You know, the kind of warning that every instruction manual for a microwave oven starts with: don't use this to dry your dog after a bath. A clear message saying "Unless you are willing to train for many years to become a software engineer yourself, this tool is not for you."

As a famous member of your community famously said, software is eating the world. That gives you, dear software engineers, a lot of power in modern society. But power comes with responsibility. If you want scientists to construct reliable implementations of models that matter for public health decisions, the best you can do is make good tools for that task, but the very least you must do is put clear warning signs on tools that you do not want scientists to use - always keeping in mind that scientists are not software engineers, and have neither the time nor the motivation to become software engineers.

Consider what you, as a client, expect from engineers in other domains. You expect cars to be safe to use by anyone with a driver's license. You expect household appliances to be safe to use for anyone after a cursory glance at the instruction manuals. Is it reasonable then to expect your clients to become proficient in your work just to be able to use your products responsibly? Worse, is it reasonable to make that expectation tacitly?

Some of you have helped with a first round of code cleanup, which I think is the most constructive attitude you can adopt in the short term. But this is not a sustainable approach for the future. We can't ask software experts for a code review every time we do something important. We computational scientists need you software engineers to help us build a better future for computer-aided research. Which means pretty much all research, because software has been eating science as well for a while. Can we count on your help?


PS added 2020-05-19T10:30: This post has provoked a lively discussion not only in the comments below but also on Twitter. There are way too many comments for me to reply to each one individually, so I decided to address recurrent topics in this follow-up.

Many people seem to have read my post as putting the main responsibility for the problems related to the cited simulation code on software engineers. This was most certainly not my intention. Scientists, policy makers, and journalists have all contributed to a less than satisfactory outcome. My open letter is clearly addressed to a particular group of people (software engineers criticizing the Imperial College Covid-19 simulations on the basis of code quality) and clearly states its focus on the role of software technology, which is what the target audience seems to overlook. A focus is always an arbitrary choice of an author for the sake of brevity or clarity. A glance at the rest of my blog should suffice to show that I do consider computational scientists responsible for their technological choices and their consequences. However, my main intention was not to assign blame for events in the past, but to outline what needs to change to prevent similar events in the future.

The car analogy was another frequent target of critical comments. Cars are a mature technology, in which many professions (engineers, workers, mechanics, driving instructors, drivers, etc.) have well-defined roles and everyone involved has a general understanding of the role of everyone else. Software is an immature technology in which roles remain fuzzy and everyone has an even fuzzier view of which other roles exist and who fills them. The discussion of my open letter has provided ample evidence for this all-encompassing fuzziness. What we collectively need to work on is turning software into a mature technology. That requires all stakeholders to make their own views of their roles explicit and then negotiate shared role definitions with everyone else. Several commenters have pointed out the emergence of research software engineers (RSEs) as a sign of progress, and I completely agree. But even the role of RSEs remains fuzzy at this time. Should they work as collaborators on research projects, with a particular specialization? Or as occasional consultants or service providers to researchers? Their interaction with the software engineering universe is even less clear. For now it is mostly one-way in that RSEs bring software technology from the outside into research labs. What my letter argues for is an action in the opposite direction: make software technology evolve to adapt to the specific needs of scientists. A big problem is culture clash. In academia, scientists are traditionally on top of the power pyramid and are used to everyone else working for them (even though the top position is now held by managers, but that's a different story). In the tech world, it's software engineers who are kings and used to everyone else, including their clients, obeying their directives. In the worst case, RSEs might find themselves trapped in the valley between two power pyramids. In the ideal case (from my point of view), they will be diplomats working towards a merger of the two kingdoms, with a simultaneous transformation into a democracy.
