Pharo year one

2019-12-31 computational science

It's the season when everyone writes about the past year, or even the past decade for a year number ending in 9. I'll make a modest contribution by summarizing my experience with Pharo after one year of using it for projects of my own.

My first contact with Pharo happened a bit more than one year ago, when I signed up for the Pharo MOOC in October 2018. But following a MOOC means working on exercise problems defined by someone else. Getting a real feeling for a programming system requires moving on to problems you actually care about. That's why I started three Pharo-based projects in 2019. The main one is the Pharo edition of ActivePapers; the other two are an exploration of the Interplanetary File System (IPFS) and a second implementation of my digital scientific notation Leibniz. In all these projects, the user interface is an important aspect, because that's one of my major motivations for using Pharo. However, instead of the standard Pharo user interface framework, which is an evolution of the original Smalltalk user interface of the 1980s, I used the Glamorous Toolkit, a complete redesign with many interesting new ideas. Perhaps the most significant innovation in the Glamorous Toolkit from my perspective is the introduction of a computational document. It resembles the fashionable computational notebooks in many ways, but differs in being an integral part of a live programming system.

As I wrote in my initial blog post on Pharo, I started out by exploring the system using the tools it provides for that purpose. In retrospect, this is clearly the strongest aspect of Pharo. The combination of code browsers, code search, object inspection, and execution inspection (via a tool misleadingly called a debugger) is an extremely powerful way to understand complex software systems. The best evidence is that I was able to write useful and non-trivial extensions to the Glamorous Toolkit, which is still rapidly evolving alpha-stage software and, judged by standard metrics such as lines of documentation per line of code, badly documented. But such metrics make no sense in a system in which searching the code base is faster than documentation lookup in standard environments. Going back to such environments after working with Pharo is a very frustrating experience.

Note that I am not saying that the Pharo environment is perfect. For my taste it requires way too much mouse use. I am still much more productive in Emacs than in Pharo for tasks supported by both, mainly because I can keep my hands on the keyboard. I also find the standard code browser in Pharo too limiting in only showing one method at a time. The Glamorous Toolkit is a clear improvement in that respect. But all the criticism I can come up with is about details that can be fixed, whereas the main defect that I now see in almost every other software development environment is much more fundamental: a barrier that separates development tools on one side from the code under development on the other side.

Similar remarks apply to the Smalltalk language on which Pharo is built. It's a minimal programming language that puts its object system center stage and pushes as many features as possible into its libraries. That's certainly an interesting point in design space to explore, but I'd personally prefer to have a couple of important concepts (for example immutable objects) as language features, rather than as implementation details of class hierarchies. But then, no language is perfect, and Smalltalk is certainly good enough for my needs.

The most serious problem that I have with Pharo is that I don't see how I could use it productively for my own research in computational biophysics in the near future. There is a small computational science community around Pharo (see e.g. this list of scientific libraries), but most of the infrastructure code that I'd need is missing. Moreover, Pharo evolves too rapidly for the kind of computational research that I do (see my critique of the SciPy ecosystem for some background information). Finally, reproducible computations remain a challenge because there isn't much of a support infrastructure for reproducibility in Pharo so far, although the recent work on bootstrapping is an important first step.

On a longer time scale, I can imagine Pharo replacing Emacs as my main user interface to computing, with the hard-core science written in different languages but interfaced to Pharo. I expect IPFS to play an important role at the cross-language interface, for various reasons that deserve an entire blog post of their own. However, it takes a lot of not-yet-written code to get there, too much to define this as a realistic goal for myself. This means that my future use of Pharo mainly depends on the directions taken by the Pharo community over the coming years. I am pretty sure that Pharo will remain an important tool in my toolbox; I just don't know what its exact role will be.