Deconstructing the Mastodon client
Ever since I joined Twitter in 2011, and then moved to Mastodon in 2022, I have been unhappy with the timeline view proposed by both of these communication platforms as their main interface. Now I have finally done something about it: I wrote my own Mastodon client. Or perhaps rather a non-client, because the concept of "the client" is a big part of what I disliked.
My use of social networks can be broken down into three categories:
- conversations, mostly public but sometimes private
- keeping up to date with the work of a small number of people or institutions
- staying in touch with communities I consider myself a part of, and following topics I find interesting
These are not clearly separated categories. It's often messages from category 2 that start conversations, and occasionally messages from category 3. But most of my daily use of Mastodon consists of
- participating in ongoing conversations
- reading the feeds of accounts I care about specifically
- scanning all the other news feeds sporadically and often superficially, depending on how much time and interest I have at the moment
A timeline view mixing all messages from all accounts I follow is somewhat acceptable for (3), but no good for (1) and (2). Mastodon proposes lists for (2), and notifications to help with (1), but neither mechanism is satisfying for me. Lists in particular suffer from an awkward user interface. Moreover, I do (3) exclusively on mobile devices (on the bus etc.), (1) almost exclusively on the desktop (as I don't like typing on on-screen keyboards), and (2) alternating between multiple devices.
There are, of course, many Mastodon clients, so I tried out a few of them. For a while, I used Fedilab on Android (for me: phone and e-ink tablet) for activity (3), and the default Web client and/or Elk, mainly on the desktop, for (1) and (2). It was a workable setup, but not a satisfying one. In addition to the cumbersome list interface, what I found missing was synchronization between my usage of multiple devices For (2), I'd need to be able to efficiently access all messages I hadn't seen before, on any of my devices (two mobile, two desktop). As a long-time Emacs user, I also tried mastodon.el, which is nice but, like Emacs, it is desktop only, and thus doesn't help with my multi-device issues.
At some point I realized that what I wanted is not a better Mastodon client, but a better Mastodon workflow. What I care about is a data structure, a stream of toots, that is accessible via an HTTP API. I want to split this stream into several streams according to various criteria. For some substreams, I want to make sure I don't miss any message. For others, I need an interface to scan all messages when I feel like it, or search for specific keywords when I don't have time for scanning everything.
Can I get such interfaces to Mastodon streams without writing my own client? Yes, by repurposing existing software. Small streams of which I don't want to miss anything are much like e-mail (after spam filtering of course!). High-volume streams that I scan or search are much like RSS feeds. There is a lot of good software for managing e-mail and RSS feeds, for all platforms I use and even exotic platforms that I don't use (yet?). There are also good infrastructure tools in this space, in particular for e-mail. isync, for example, takes care of IMAP(S), letting me work with local files (Maildir) and not worry about networks, certificates, and their various modes of failure.
It actually takes surprisingly little software to transform Mastodon streams into e-mail and RSS feeds, if you can resist temptations of overengineering. A toot is a snippet of HTML with optional attachments (images, video, audio). That's also what a MIME message happens to be. A near-perfect match. RSS items are HTML snippets as well. No attachments, but you can include the same preview images that Mastodon clients display with toots. If you can find support libraries for mail, RSS, and the Mastodon API in a programming language that you know well enough, this becomes a very manageable side project.
If your preferences match mine, meaning you are happy to use Common Lisp for such a job, you can use my code as a starting point for your own Mastodon experiments. Its main support libraries are tooter for the Mastodon API, and mel-base for e-mail. RSS is trivial if you have XML support, for which I use plump. My RSS aggregator is Newsblur, which has a reasonable Web interface for the desktop and a very nice Android app. For e-mail, I use K9 on Android, and Emacs on the desktop, but I am pretty sure any other e-mail client would work fine as well. The most time-consuming aspect turned out to be mel-base, a library that's insufficiently documented and not quite up to date, lacking support in particular for subject lines and account names containing Unicode characters.
If you have followed so far, you have probably noticed that my non-client supports nothing but reading toots. Each of my transformed toots ends with a link that opens it in the default Web client, where I can reply, boost, or like. The Web client is also what I use for administrative tasks. Bonus: I add another link to each toot that opens it in the instance of its author, where I have access to the full reply chain, of which my own instance often captures only a subset. A very simple solution to one of Mastodon's unfortunate limitations that are due to federation.
The hopefully generalizable lesson from this project is that it is possible to improve one's personal computing environment with reasonable effort, under the condition of accepting an initial learning curve for some technologies. The important question then is how to identify technologies that are worth learning, which I interpret as technologies that are likely to be useful again for other software personalization efforts. A first draft of a list of criteria:
- Choose boring technology. You want well-known, well-documented, and stable infrastructure to build on. No surprises, no tech churn. Your learning effort should be a good investment.
- Choose small-scale rather than enterprise-grade technology. Your problems and challenges are very different from Microsoft's. Prefer small software stacks.
- Corollary 1: choose carefully who you turn to for advice. Most conference talks, blog posts, StackOverflow discussions, etc. come from software professionals. Better listen to people like yourself (but no, I have no advice on where to find them, nor how to judge their competence).
- Corollary 2: consider old technology. Most modern software development tools are designed for software professionals. Tools for small-scale development were common in the 1980s and 1990s, before computers became commodities. Technology from that era that's still supported today may well be your best bet. I am a happy user of Emacs, Smalltalk (more precisely Pharo with Glamorous Toolkit as my preferred user interface), and Common Lisp (more precisely SBCL). Python is from the 1990s as well, but since it was widely adopted by software professionals in the 2000s, its ecosystem suffers too much from tech churn for my taste.
- Build on general protocols and file formats rather than specialized ones. Hierarchical filesystems rather than the Dropbox API. E-mail rather than Matrix. HTML, XML, and JSON files rather than JavaScript libraries or Web APIs.
- Consider debuggability. Delegate hard-to-debug stuff (e.g. networking, in particular with encryption) to other software. Choose tools that support debuggability. Debugging is a lot easier if you can build your own problem-specific debugging tools, which in turn is best supported by development tools that are extensible and focus on rapid feedback. Smalltalk systems are best in class in this respect, and Glamorous Toolkit even turned this into a design principle, called "Moldable Development".
Unfortunately, there is one more aspect to making good choices that is hard to generalize: you need some expertise in figuring out which problems you can solve yourself with reasonable effort and which are so hard that your efforts are better spent on delegating or circumventing them. Data synchronization is in this second category, but like most people I learned this the hard way (years ago), while trying to do it myself and losing both time and data in the process.
After a few weeks of using my setup, I am fully satisfied with it. I also note that my original ideas about defining my personal algorithmic feeds have evolved substantially with practical experience. Once I have taken care of conversations (they go to e-mail) and the small set of accounts I follow closely (a low-volume RSS feed), I ended up splitting the remaining toots (i.e. most of my timeline) by topics in the crudest imaginable way: substring search. It's not perfect but definitely good enough. There's always room for improvement. My main failure so far is in removing all the cat-related toots from my feeds. That may actually require AI-based image recognition. Some problems are hard!
I'd love to hear about similar projects in this space (tell me on Mastodon!). The only one I am aware of is Jon Udell's Steampipe-based client. Steampipe provides an SQL/database view on many Web services, which is perfect for doing non-trivial queries. That's something my own setup doesn't address at all. It's not something I feel a need for right now, but I may well add Jon's client to my toolbox one day.