I still consider XMPP the open standard for “chat” (well, instant messaging), and I have been using Psi as an XMPP client for almost 20 years now. However, since Psi has occasionally crashed on me recently (as in: at least since Bullseye), presumably on receiving some message, I consider it a certainty that it is remotely exploitable. Given its large codebase I don't think I want to fix whatever is wrong myself, and I don't think there are still people maintaining Psi.
I therefore migrated to gajim last week; after all, one of the nice things about open standards is that there are usually multiple implementations. This, however, made me update an ancient hack to automatically manage my status so that I'm XMPP-offline when it's nobody's business whether or not my machine is on.
In this post, I'd like to tell you how that works, hoping it may be useful to solve other (but similar; for instance: get offline when doing talks) problems, too.
Not Always Online
First off, the major reason I'm not much of a fan of synchronous messaging (which IM is, and email is not) is that it requires some sort of “presence” notification: something needs to know whether I am online, and where I can be reached. At least in XMPP, additionally all your contacts get to know that, too.[1]
While I admit that can be useful at times, during the night and on weekends, I really don't want to publish when my computer is on and when it's not. Hence I have so far told my Psi and I am now telling my gajim to not automatically re-connect on weekends or between 20:00 and 7:00. That I can specify this perhaps somewhat unique preference illustrates how great shell integration everywhere is. The ingredients are:
- ifupdown, Debian's native network management. If you're using systemd or NetworkManager or something, I think these use other hooks [if you've tried it, let me know so I can update this].
- D-Bus, a framework to communicate between programs sitting on a common X11 display (though with gajim, D-Bus becomes somewhat hidden).
- the shell, which lets you write little ad-hoc programlets and duct-tape together all the small utilities that have accumulated in Unix since the early 1970s (here: logger, date, and egrep).
Inter-Process Communication with D-Bus
The first thing I want to do is make gajim offline before a network interface goes down. That way, people don't have to wait for timeouts to see I am unavailable (unless someone pulls the cable or the Wifi disappears – without a network, gajim can't sign off). That means I have to control a running gajim from the outside, and the standard way to do that these days is through D-Bus, a nifty, if somewhat over-complicated way of calling functions within programs from other programs.
One of these other programs is qdbus, which lets you inspect what listens on your session's (or, with an option, system's) D-Bus and what functions you can call where. For instance:
$ qdbus org.gajim.Gajim /org/gajim/Gajim
...
method void org.gtk.Actions.SetState(QString action_name, QDBusVariant value, QVariantMap platform_data)
...
In Psi, with a bit of fiddling, a generic D-Bus tool was enough to switch the state. Since there's a QDBusVariant in the arguments gajim's SetState method wants according to the qdbus output, I don't think I could get away with that after the migration – qdbus does not seem to be able to generate that kind of argument.
Enter gajim-remote
But gajim comes with a D-Bus wrapper of its own, gajim-remote, and with that, you can run something like:
gajim-remote change_status offline
Except that won't work out of the box. That's because gajim comes with remote control disabled by default.
To enable it, go to Preferences → Advanced, click Advanced Configuration Editor there, and then look for the remote_control configuration item. I have no idea why they've hidden that eminently useful setting so well.
Anyway, once you've done that, you should be able to change your status with the command above and:
gajim-remote change_status online
ifupdown's Hooks
I now need to arrange for these commands to be executed when network interfaces go up and down. These days, it would probably be smart to go all the way and run a little daemon listening to D-Bus events, but let me be a bit less high-tech, because last time I looked, something like that required actual and non-trivial programming.
In contrast, if you are using ifupdown to manage your machine's network interfaces (and I think you should), all it takes is a bit of shell scripting. That's because ifupdown executes the scripts in /etc/network/if-up.d once a connection is up, and the ones in /etc/network/if-down.d before it brings a connection down in a controlled fashion. These scripts see a few environment variables that tell them what's going on (see interfaces(5) for a full list), the most important of which are IFACE (the name of the interface being operated on), and MODE, which would be start or stop, depending on what ifupdown is doing.
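To get a feeling for what such a hook sees, here is a minimal sketch of a script that merely reports the event it was called for. The script name show-event and the function describe_event are made up for illustration; IFACE, MODE and PHASE are among the variables interfaces(5) documents, and the "unset" defaults only exist so the script can also be run by hand.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/network/if-up.d/show-event (name made up).
# ifupdown exports IFACE, MODE and PHASE to hook scripts; the "unset"
# fallbacks only serve interactive test runs outside of ifupdown.

describe_event() {
    echo "iface=${IFACE:-unset} mode=${MODE:-unset} phase=${PHASE:-unset}"
}

# In a real hook you would probably pipe this into logger instead of
# printing it, since hook output usually goes nowhere useful.
describe_event
```

Run by hand as `IFACE=wlan0 MODE=start sh show-event`, this prints the values it was handed, which is a cheap way to check your assumptions before wiring in anything that changes state.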
The idea is to execute my change_status commands from these scripts. To make that a bit more manageable, I have a common script for both if-up.d and if-down.d. I have created a new subdirectory /etc/network/scripts for such shared ifupdown scripts, and I have placed the following file in there as jabber:
#!/bin/sh
# State management of gajim

DESKTOP_USER=msdemlei

case $MODE in
start)
  case $IFACE in
  eth* | wlan* | n900)
    # only go online Monday to Friday between 07:00 and 19:59
    if ! date +'%w/%H' | grep '[1-5]/\(0[789]\|1[0-9]\)' > /dev/null; then
      exit 0
    fi
    su - $DESKTOP_USER -c 'DISPLAY=:0 gajim-remote change_status online "Got net"' > /dev/null || exit 0
    ;;
  esac
  ;;

stop)
  case $IFACE in
  eth* | wlan* | n900)
    # only sign off if gajim currently reports being online
    if [ tonline = "t`su $DESKTOP_USER -c 'DISPLAY=:0 gajim-remote get_status'`" ]; then
      su - $DESKTOP_USER -c "DISPLAY=:0 gajim-remote change_status offline 'Losing network'" || exit 0
      # give gajim a moment to tell the server before the network goes away
      sleep 0.5
    fi
    ;;
  esac
  ;;
esac
After chmod +x-ing this file, I made symbolic links like this:
ln -s /etc/network/scripts/jabber /etc/network/if-down.d/
ln -s /etc/network/scripts/jabber /etc/network/if-up.d/
– and that should basically be it (once you configure DESKTOP_USER).
Addendum (2023-12-02)
Let me admit that this never really worked terribly well with gajim, mainly because – I think – its connections don't time out, and so once a status update hasn't worked for one reason or another, gajim would be in a sort of catatonic state. That's one of the reasons I moved on to pidgin, and its state management again broke when upgrading to Debian bookworm. My current script is near the bottom of this December 2023 post.
Debugging Admin Scripts
Because it is a mouthful, let me comment a bit about what is going on:
logger Jabber: $MODE $IFACE $LOGICAL
logger is a useful program for when you have scripts started deeply within the bowels of your system. It writes messages to syslog, which effectively lets you do printf debugging of your scripts. Once everything works for a script like this, you probably want to comment the logger lines out.
Note that while developing scripts of this kind, it is usually better to just get a normal shell, set the environment variables (or pass the arguments) that you may have obtained through logger, and then run them interactively, possibly with a -x option (print all statements executed) passed to sh. For instance:
$ MODE=start IFACE=wlan0 sh -x /etc/network/scripts/jabber
+ DESKTOP_USER=anselmf
+ logger Jabber: start wlan0
+ case $MODE in
+ case $IFACE in
+ date +%w/%H
+ grep '[1-5]/\(0[789]\|1[0-9]\)'
+ exit 0
– that way, you see exactly what commands are executed, and you don't have to continually watch /var/log/syslog (or journalctl if that's what you have), not to mention (for instance) bring network interfaces up and down all the time.
Case Statements in Bourne's Legacy
The main control structure in the script is:
case $MODE in
start)
  ...
  ;;
stop)
  ...
  ;;
esac
Case statements are one of the more powerful features of descendants of the Bourne shell. Read about them in the excellent ABS (the Advanced Bash-Scripting Guide) in case you are a bit mystified by the odd syntax and the critically important ;; lines.
The particular case construct here is there so I can use the same script for if-up.d and if-down.d: it dispatches on whatever is in MODE. In case MODE is something other than start or stop, we silently do nothing. That is not always a good idea (programs failing without complaints are a major reason for the lack of hair on my head), but since this isn't really user-callable, it's probably an acceptable behaviour.
General rule of thumb, though: Be wary of case .. esac without a *) (which gives commands executed when nothing else matches).
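For illustration, here is a sketch of what a dispatch with a default branch could look like. The function name dispatch_mode and the messages are invented for this example; the real script performs its status changes where the echo lines are.

```shell
#!/bin/sh
# Sketch of a MODE dispatch that complains about unexpected values.
# dispatch_mode and its messages are hypothetical, not part of the
# script discussed here.

dispatch_mode() {
    case $1 in
    start)
        echo "would go online"
        ;;
    stop)
        echo "would go offline"
        ;;
    *)
        # Anything else lands here instead of being silently ignored.
        echo "unexpected MODE: '$1'" >&2
        return 1
        ;;
    esac
}
```

Calling `dispatch_mode "$MODE"` then yields a message on stderr and a non-zero status for anything but start or stop. In an ifupdown hook you would have to decide whether that non-zero status should really propagate.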
But despite this golden rule, the other case constructs in here don't have a default option either:
case $IFACE in eth* | wlan* | n900)
This is a second dispatch, this time on the interface name. That's here because these scripts are also called when a loopback interface (just connecting your machine with itself) or a tunnel interface are being brought up, and it sucks if gajim is told to connect (because it won't reach the jabber server via loopback) or disconnect (because the tunnel interfaces typically have nothing to do with my primary network connections) when one of these is brought up or down.
As you can see, you can have glob patterns in the case labels. Note that unless you set up your own interface naming (using udev), your interface names will look a lot messier in recent Debians; you would probably want patterns like en* | wl* or so. See your /etc/network/interfaces to see what you have.
By the way, it is a bit ugly that the interface list sits in the script twice, once in the start code and once in the stop code (background: originally, these have been two files). There are ways around that in bash (basically: stuff extglob patterns into a variable), but I figured that's too fancy for a script of just a few lines – and if I ever want to avoid the duplication, I'd probably create a shell function is_managed_interface rather than keep the case statement.
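A minimal sketch of such an is_managed_interface function, assuming the same three patterns as in the script, might look like this. It stays within plain POSIX sh, so no bash extglob is needed.

```shell
#!/bin/sh
# Sketch: keep the list of managed interfaces in one place.
# The patterns are the ones from the script; adapt them to your
# own interface names (e.g. en* | wl*).

is_managed_interface() {
    case $1 in
    eth* | wlan* | n900)
        return 0
        ;;
    *)
        return 1
        ;;
    esac
}
```

Both the start and the stop branch could then begin with `is_managed_interface "$IFACE" || exit 0` rather than repeating the inner case statement.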
“Business Logic”
The meat of the matter is in the following, admittedly somewhat dense, lines:
if ! date +'%w/%H' | egrep '[1-5]/(0[789]|1[0-9])' > /dev/null; then
  exit 0
fi
This is the selection of when I want to be available through XMPP. How?
Well, date +'%w/%H' prints the current weekday (Sunday is 0), a slash, and the hour of the day. Right now, on a Sunday around 11:00 local time, it says:
$ date +'%w/%H'
0/11
With this, I've made my “business logic” (unfortunately, I'm not making that term up) of “stay offline on weekends and between 20:00 and 7:00” a problem of matching regular expressions – which, despite their sometimes questionable reputation, are actually a great way to specify finite-state machines, which in turn are good because they can encode logic that humans still can reason about without risking their sanity.
Here is what the grep does: the [1-5] in there means: Only ever become online from Monday (1) to Friday (5). The /(0[789]|1[0-9]) says about as much as: when the hour is one of 07, 08, or 09, or something between 10 and 19. If this seems magic to you: Do read up on regular expressions – I promise you will curse yourself for not having done that earlier.
Grep returns a 0 if it finds a match for the pattern in the argument in its input, and the shell's if construct a bit perversely executes its then branch when the command returns 0. That's why there is the bang in front of it all. The redirection to /dev/null here swallows grep's output when there is a match (not that it matters much because this script's output goes nowhere anyway). Now that I think about it, I should have used grep's -q option all those years ago; but the redirection works fine, and so I won't re-write things now.
Oh, and note I'm doing egrep here; that's because with plain grep you would have to use backslashes to give parentheses and the vertical bar their special regular expression meaning.
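To try the pattern without waiting for the right time of day, you can feed it hand-made time stamps. The following sketch wraps the test into a function (the name in_online_window is made up) and uses grep -E -q, which is the current spelling of egrep plus the -q option mentioned above.

```shell
#!/bin/sh
# Sketch: test the "online window" pattern against sample time stamps.
# in_online_window is a hypothetical helper; its argument has the
# format produced by date +'%w/%H'.

in_online_window() {
    printf '%s\n' "$1" | grep -E -q '[1-5]/(0[789]|1[0-9])'
}

for stamp in 1/07 5/19 5/20 0/11; do
    if in_online_window "$stamp"; then
        echo "$stamp: online"
    else
        echo "$stamp: offline"
    fi
done
```

This reports Monday 07 and Friday 19 as online, and Friday 20 as well as Sunday 11 as offline. In the hook itself, the call would be `in_online_window "$(date +'%w/%H')" || exit 0`.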
Switching Users, Finding the Session bus
Finally, the particularly unsightly:
su - $DESKTOP_USER -c 'DISPLAY=:0 gajim-remote change_status online "Got net"' || exit 0
is the reason why, these days, I'd probably run something on the D-Bus directly; that way, I would not have to guess a user and a display where gajim might be running.
What's going on here is that I first change to the user that's likely running the X server (the su -). This is because this script is being executed as root, but I want to connect to the D-Bus as the user that has the display. I'll have to give the command to execute in a -c option, and to hold that command together, I'm using single quotes, which leaves me double quotes within the command to run.
A user can have multiple busses, but gajim will only listen to one of them, the one attached to its X display[2]. Unless you're doing something really clever, that display will be :0, so that's what I pass as environment into gajim-remote.
Finally the || exit 0 is idiomatic shell for “if this fails, then return 0 to the calling process”. That's important here because when an interface script returns non-zero, the interface will not be brought up. That certainly wouldn't be appropriate just because something is wrong with jabber.
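If several commands in a hook need this treatment, the idiom can be wrapped into a small helper. This is only a sketch; the name best_effort is invented, and within a function return 0 plays the role that exit 0 plays in the script.

```shell
#!/bin/sh
# Sketch: run a command, but never let its failure become the
# hook's failure. best_effort is a hypothetical helper name.

best_effort() {
    "$@" || return 0
}
```

For example, `best_effort false` finishes with status 0 although false itself fails. The price is that real problems disappear silently, so this is best combined with some logging while you are still debugging.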
You better not sleep
When bringing the interface down, there's a && sleep 0.3 sprinkled in, which is idiomatic for “if the previous command is successful, sleep for a moment”.
And it is a demo for the dangers of shell-scripting. What I want is that gajim has time to actually update my status on the server before the network connection is going away. Of course, blindly sleeping a fixed amount of time is not a good way to do that. Most of the time, sign-off will be faster and I'm wasting time when bringing the interface down (which you will usually do when putting the machine to sleep). But perhaps for one reason or another it may be slower occasionally, and then I'll yield too early. A much better way would be to wait (with a timeout) for gajim to confirm it has deregistered me and then immediately continue.
But then, worse is sometimes better – this code has served me well for 15 years, and as long as there are not too many sleep calls in all the scripts called while the machine goes to sleep, the simplicity of just sitting still for 300 ms is marginally worth it. Just don't ever force something like that on people on whose machines sleeps of this kind might accumulate.
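A rough sketch of such a wait with a timeout, done by polling, could look like the following. The helper name wait_for is made up, and the fractional sleep is not guaranteed by POSIX (GNU coreutils and busybox accept it). Whether gajim-remote get_status reflects the server-side state promptly enough is an untested assumption, which is why it only appears in a comment.

```shell
#!/bin/sh
# Sketch: poll a condition instead of sleeping blindly.
# wait_for MAX_TRIES COMMAND [ARGS...] runs COMMAND up to MAX_TRIES
# times, 0.1 s apart, and returns 0 as soon as COMMAND succeeds.

wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 0.1
    done
    return 1
}

# Hypothetical use in the stop branch (assumes get_status stops
# printing "online" once the sign-off went through):
#   no_longer_online() { [ "$(gajim-remote get_status)" != online ]; }
#   wait_for 20 no_longer_online
```

With 20 tries this gives up after about two seconds in the worst case, but continues right away in the common case where the sign-off is quick.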
In case you're wondering: explicitly bringing the interfaces down when the machine goes to sleep is another topic for mild shell scripting. But this post is already too long, so if you really want to know what I'm doing about this: Please write in, and I'll be happy to write a (much shorter) post on that.
[1] And not necessarily only your contacts: While I've mercifully repressed all the details, I distinctly remember the embarrassed laughter during a conference talk when someone had their display cloned to a projector and a notification came up that someone else who was supposed to be unavailable just connected to skype or some other proprietary IM thingy, making it clear they hadn't been all that unavailable at all.
[2] I freely admit to having no idea how this kind of thing works with Wayland; I'm not saying they'll have to pry X11 from my cold, dead hands, but so far I can't say the story for migrating to Wayland is terribly convincing.
Cited in: Another Bookworm Regression: D-bus, X11 Displays, purple-remote, Oh My!