When I reported on what broke when I upgraded to Debian bookworm, I overlooked that my jabber presence management (where I'm offline at night and on weekends) no longer worked. Figuring out why and fixing it was a dive into D-Bus and X11 that may read like a noir detective novel, at least if you are somewhat weird. Let me write it up for your entertainment and perhaps erudition.
First off, compared to the March post, I have migrated to pidgin as my XMPP (“jabber”) client; at its core, presence management still involves a script in /etc/network/if-*.d, where I used to call something like:
su $DESKTOP_USER -c "DISPLAY=:0 purple-remote getstatus"
whenever a sufficiently internetty network interface went up or down, where DESKTOP_USER contains the name under which I'm running my X session (see below for the whole script with the actual presence-changing commands).
Purple-remote needs to run as me because it should use my secrets rather than root's. But it was the DISPLAY=:0 thing that told purple-remote how to connect to the pidgin instance it was supposed to interrogate and control. Like most boxes today, mine is basically a single-user machine (at least as far as “in front of the screen” goes), and hence guessing the “primary” X display is simple and safe.
Between X11 and the D-Bus
That purple-remote needed the DISPLAY environment variable was actually almost a distraction from the start. There are many ways for Unix programs to talk to each other, and DISPLAY might have pointed towards 1980s-style X11 inter-client communication. But no, the purple-remote man page already says:
This program uses DBus to communicate with Pidgin/Finch.
Correctly spelled D-Bus, this is one of the less gruesome things to come out of the freedesktop.org cauldron, although it is still riddled with unnecessarily long strings, unnecessarily deep hierarchies, and perhaps even unnecessary use of XML (though I feel sympathies in particular for that last point).
But that's not what this post is about. I'm writing this because after upgrading to Debian bookworm, purple-remote no longer worked when used from my if-up.d script. Executing the command in a root shell (simulating how it would be called from ifupdown) showed this:
# DESKTOP_USER=anselm su $DESKTOP_USER -c "DISPLAY=:0 purple-remote getstatus"
No existing libpurple instance detected.
A quick glance at the D-Bus Specification gives a hint at how this must have worked: dbus-launch – which is usually started by your desktop environment, and in my case by a:
export $(dbus-launch --exit-with-x11)
in ~/.xinitrc – connects to the X server and leaves a “property” (something like a typed environment variable attached to an X11 window) named _DBUS_SESSION_BUS_ADDRESS in, ah… for sure the X server's root window [careful: read on before believing this]. As the property's value, a D-Bus client would find a path like:
unix:path=/tmp/dbus-1cAbvsX6FD,guid=795a0d...
and it could open that socket to talk to all other D-Bus clients started within the X session.
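To make the shape of such an address concrete, here is a minimal sketch that pulls the socket path out of it in plain shell. It assumes a single unix: address with comma-separated key=value pairs, as in the sample above. The function name bus_socket_path and the sample value are made up for illustration, and the sketch ignores the escaping and the semicolon-separated address lists that the D-Bus specification also allows.

```shell
# Hypothetical helper: print the path= component of a "unix:" D-Bus address.
# The function body runs in a subshell (note the parentheses), so the
# changes to IFS and globbing do not leak into the caller.
bus_socket_path() (
    case $1 in
        unix:*) params=${1#unix:} ;;
        *) exit 1 ;;              # not a unix transport: nothing to print
    esac
    IFS=,       # split the parameter list at commas
    set -f      # no filename globbing while splitting
    for kv in $params; do
        case $kv in
            path=*) printf '%s\n' "${kv#path=}"; exit 0 ;;
        esac
    done
    exit 1      # e.g. an address without a path= key
)

# Made-up sample address, not a real socket:
bus_socket_path 'unix:path=/tmp/dbus-example,guid=0123abcd'
# prints /tmp/dbus-example
```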
Via apropos to xprop to Nowhere
So… Does that property exist in the running X server? Hm. Can I figure that out without resorting to C programming? Let's ask the man page system:
$ apropos property
[..lots of junk...]
xprop (1)            - property displayer for X
[...]
Typing in man xprop told me I was on the right track:
$ man xprop
SYNOPSIS
     xprop […] [format [dformat] atom]*

SUMMARY
     The xprop utility is for displaying window and font properties in an X server.

OPTIONS
     […]
     -root   This argument specifies that X's root window is the target window.
             This is useful in situations where the root window is completely obscured.
So, let's see:
$ xprop -root _DBUS_SESSION_BUS_ADDRESS
_DBUS_SESSION_BUS_ADDRESS:  not found.
Hu? Has dbus-launch stopped setting the property? Let's inspect Debian's change log; a major change like that would have to be noted there, wouldn't it? Let's first figure out which package to look at; the documentation then is in /usr/share/doc/<packagename>:
$ dpkg -S dbus-launch
dbus-x11: /usr/bin/dbus-launch
$ zless /usr/share/doc/dbus-x11/changelog.Debian.gz
Looking for “property” or “BUS_ADDRESS” in there doesn't yield anything; that would make it unlikely that the property was somehow dropped intentionally. I have to admit I had halfway expected that, with something like “for security reasons”. But then if someone can read your root window's properties, access to your session bus is probably the least of your problems.
Still, perhaps someone is slowly dismantling X11 support on grounds that X11 is kinda uncool? Indeed, you can build dbus-launch without X11 support. If the Debian maintainers built it that way, the respective strings should be missing in the binary, but:
$ strings `which dbus-launch` | grep _DBUS_SESSION
_DBUS_SESSION_BUS_PID
_DBUS_SESSION_BUS_ADDRESS
_DBUS_SESSION_BUS_SELECTION_
No, that's looking good; dbus-launch should still set the properties.
Skimming the Docs is Not Reading the Docs.
If I did not see the property a moment ago, perhaps I have used xprop the wrong way? Well, actually: I didn't read the D-Bus spec properly, because what it really says is this:
For the X Windowing System, the application must locate the window owner of the selection represented by the atom formed by concatenating:
- the literal string "_DBUS_SESSION_BUS_SELECTION_"
- the current user's username
- the literal character '_' (underscore)
- the machine's ID
– and then find the _DBUS_SESSION_BUS_PID on the window owning that selection. The root window thing was my own fantasy.
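The naming recipe quoted above can be sketched in a few lines of shell. This only builds the atom name; it does not look up the selection owner. The function name selection_atom and the sample arguments are made up for illustration.

```shell
# Hypothetical helper: build the selection atom name from its two variable
# parts, the user name ($1) and the machine ID ($2).
selection_atom() {
    printf '_DBUS_SESSION_BUS_SELECTION_%s_%s\n' "$1" "$2"
}

# Made-up user name and machine ID:
selection_atom exampleuser 0123456789abcdef
# prints _DBUS_SESSION_BUS_SELECTION_exampleuser_0123456789abcdef
```

On a live system, the first argument would presumably come from id -un. For the second, dbus-uuidgen --get should print the machine ID where that tool is installed, though I have not checked that this is the exact value dbus-launch uses.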
If you bothered to skim the ICCCM document I linked to above, you may recognise the pattern: that's just conventional X inter-client communication – no wonder everyone prefers D-Bus.
This is beyond what I'd like to do in the shell (though I wouldn't be surprised if xdotool had a hack to make that feasible). I can at least establish that dbus-launch still produces what the spec is talking about, because the “atoms” – a sort of well-known string within the X server and as a concept probably part of why folks are trying to replace X11 with Wayland – are all there:
$ xlsatoms | grep DBUS
488 _DBUS_SESSION_BUS_SELECTION_anselm_d162...
489 _DBUS_SESSION_BUS_ADDRESS
490 _DBUS_SESSION_BUS_PID
The Next Suspect: libdbus
Given that, dbus-launch clearly is exonerated as the thing that broke. The next possible culprit is purple-remote. It turns out that's a python program:
$ grep -i dbus `which purple-remote`
import dbus
obj = dbus.SessionBus().get_object("im.pidgin.purple.PurpleService", "/im/pidgin/purple/PurpleObject")
purple = dbus.Interface(obj, "im.pidgin.purple.PurpleInterface")
data = dbus.Interface(obj, "org.freedesktop.DBus.Introspectable").\
So, this is using the python dbus module. Let's see if its changelog says anything about dropping X11 support:
$ zless /usr/share/doc/python3-dbus/changelog.Debian.gz
Again, nothing for X11, property, or anything like that. Perhaps we should have a brief look at the code:
$ cd /some/place/for/source
$ apt-get source python3-dbus
[…]
dpkg-source: info: extracting dbus-python in dbus-python-1.3.2
[…]
$ cd dbus-python-1.3.2/
You will see that the python source is in a subdirectory called dbus. Let's see if that talks about our property name:
$ find . -name "*.py" | xargs grep _DBUS_SESSION_BUS_ADDRESS
$
No[1]. Interestingly, there's no mention of X11 either. Digging a bit deeper, however, I found a C module dbus_bindings next to the python code in dbus. While it does not contain promising strings (X11, property, SESSION_BUS…) either, that lack made me really suspicious, since at least the environment variable name should really be visible in the source. The answer is in the package's README: “In addition, it uses libdbus” – so, that's where the connection is being made?
Another Red Herring
That's a fairly safe bet. Let's make sure we didn't miss something in the libdbus changelog:
$ zless /usr/share/doc/libdbus-1-3/changelog.Debian.gz
You will have a déjà-vu if you had a look at dbus-x11's changelog above: the two packages are built from the same source and hence share a Debian changelog. Anyway, again there are no suspicious entries. On the contrary: An entry from September 2023 (red-hot by Debian stable standards!) says:
dbus-user-session: Copy XDG_CURRENT_DESKTOP to activation environment. Previously this was only done if dbus-x11 was installed. This is needed by various freedesktop.org specifications…
I can't say I understand much of what this says, but it definitely doesn't look as if they had given up on X11 just yet. But does that library still contain the property names?
$ dpkg -L libdbus-1-3
[…]
/lib/i386-linux-gnu/libdbus-1.so.3
[…]
$ strings /lib/i386-linux-gnu/libdbus-1.so.3 | grep SESSION_BUS
DBUS_SESSION_BUS_ADDRESS
$
No, it doesn't. That's looking like a trace of evidence: the name of the environment variable is found, but there's nothing said of the X11 property. If libdbus evaluated that property, it would stand to reason that it would embed its name somewhere (though admittedly there are about 1000 tricks with which it would still do the right thing without the literal string in its binary).
Regrettably, that's another red herring. Checking the libdbus from the package in bullseye (i.e., the Debian version before bookworm) does not yield the property name either, and there purple-remote did work with just a DISPLAY.
Time For The Git Log
At this point I was starting to despair. But I did not want to give up without having at least had a look at the upstream sources. apt info libdbus-1-3 says the homepage of the package is at https://dbus.freedesktop.org/, and from there getting to the git repository of both libdbus and dbus-launch-x11 is just one click.
Then getting the URL from which to git clone is harder, because gitlab's browser interface still is broken when you forbid Javascript local storage (it's been that way for at least three years). I gave the freedesktop gitlab the privileges it so desired and that way learned I had to say:
git clone https://gitlab.freedesktop.org/dbus/dbus.git
In there, I started with the changelog again, except this time I could use the commit messages (git log). I was looking for SESSION_BUS in there and again did not find anything talking about X11 atoms within the last few years. But let's see whether in 2016 (older than the bullseye package for sure) there was more code talking about our X11 atom; I picked a git commit id from that time more or less randomly:
$ git checkout d64d5f996b6146905b665cac9aa02da003ca1346
$ find . -name "*.c" | xargs grep _DBUS_SESSION_BUS_ADDRESS
[…]
Oh dang: the only match is dbus-launch-x11.c again. Even in 2016.
The conclusion that it's not libdbus becomes almost inevitable. Let's try it with an example that comes with the python3-dbus package; I'm using one with “service” in its name because that probably will not expect another service to work:
$ cd /usr/share/doc/python3-dbus/examples
$ export -n DBUS_SESSION_BUS_ADDRESS
$ python3 example-service.py
Running example service.
[...]
$ export -n DISPLAY
$ python3 example-service.py
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NotSupported: Unable to autolaunch a dbus-daemon without a $DISPLAY for X11
That last message is a bit funky, but it's pretty clear that libdbus still can pull the D-Bus address from the X display just fine, or it wouldn't have worked without the environment variable. Oh my, I've been on the wrong track all along. It seems it's something within purple-remote specifically. Skimming its changelogs (of the libpurple-bin package as per the dpkg -S trick above) does not yield anything either. Should I now dig into its source?
Aw, I have lost my patience; perhaps someone else wants to take it from here? Meanwhile, if purple-remote insists on getting the D-Bus socket path from the environment variable DBUS_SESSION_BUS_ADDRESS – that's why it works within the X11 session:
$ purple-remote getstatus
offline
$ export -n DBUS_SESSION_BUS_ADDRESS
$ purple-remote getstatus
No existing libpurple instance detected.
– then that's what I'll give it, and forget that DISPLAY was once enough.
I'm afraid this whodunnit will end in the second most unsatisfying way conceivable: we don't learn who the real culprit is, but at least the wrongs are made right.
If You Can't Fix It, Work Around It
For that, I just need to get that session bus address into my jabber script. From there, peeking into the X session's environment is not easy (well, there's /proc/<pid>/environ, which root could read, but guessing a pid that would see the bus path is not trivial). So, let's instead use a file in a known location; while pranksters will find the socket names in /tmp anyway (and can't do much with them), let's be defensive and put that file into the ~/.ssh directory, because that usually has rather tight permissions.
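For comparison, here is a rough sketch of the /proc route I decided against. On Linux, a process's environment is exposed as /proc/<pid>/environ with NUL bytes between the entries. The helper name bus_address_from_environ is made up, and which pid to ask remains the unsolved part.

```shell
# Hypothetical helper: print the value of DBUS_SESSION_BUS_ADDRESS from an
# environ-style file, in which entries are separated by NUL bytes rather
# than newlines. Prints nothing if the variable is not set in that file.
bus_address_from_environ() {
    tr '\000' '\n' < "$1" | sed -n 's/^DBUS_SESSION_BUS_ADDRESS=//p'
}
```

As root, one might call this as bus_address_from_environ /proc/<pid>/environ, with the pid taken from something like pgrep -u "$DESKTOP_USER" -x pidgin. That assumes pidgin was started from within the X session and has inherited the variable, which is exactly the kind of guess I would rather not rely on.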
In practice, I amended my .xinitrc with something like this:
export $(dbus-launch --exit-with-x11)
echo "$DBUS_SESSION_BUS_ADDRESS" > ~/.ssh/session-bus-address
If you (as is likely) do not use an .xinitrc any more, any shell script executed during desktop startup will do for the echo; the dbus-launch – which is a lot more critical because it needs to manipulate the session's environment – is almost certainly covered by your desktop environment already.
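If you would rather not depend on the directory's permissions alone, the echo can be wrapped so that the file itself ends up readable only by its owner. This is a minimal sketch, not what I actually run; the function name save_bus_address is made up.

```shell
# Hypothetical helper: store the session bus address in the file named by $1
# so that only its owner can read it. The function body runs in a subshell
# (note the parentheses), so the umask change does not leak into the caller.
save_bus_address() (
    umask 077                                        # newly created files get mode 0600
    printf '%s\n' "$DBUS_SESSION_BUS_ADDRESS" > "$1"
    chmod 600 "$1"                                   # also tighten a pre-existing file
)
```

It would be called as save_bus_address ~/.ssh/session-bus-address, at a point where DBUS_SESSION_BUS_ADDRESS is already set.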
After that, I can read this address in my /etc/network/scripts/jabber script from the March post. That now looks like this:
#!/bin/sh
# State management of pidgin

DESKTOP_USER=your-user-name
JABBER_CONTROL="DBUS_SESSION_BUS_ADDRESS=`cat /home/$DESKTOP_USER/.ssh/session-bus-address` purple-remote"

case $MODE in
start)
  case $IFACE in
  eth* | wlan* | wwan* | n900 | o2)
    if ! date +'%w/%H' | grep '[1-5]/\(0[789]\|1[0-9]\)' > /dev/null; then
      exit 0
    fi
    su - $DESKTOP_USER -c "$JABBER_CONTROL setstatus?status=available" || exit 0
    ;;
  esac
  ;;

stop)
  case $IFACE in
  eth* | wlan* | wwan* | n900 | o2)
    if [ toffline != "t`su $DESKTOP_USER -c "$JABBER_CONTROL getstatus"`" ]; then
      su - $DESKTOP_USER -c "$JABBER_CONTROL setstatus?status=offline" || exit 0
      sleep 0.5
    fi
    ;;
  esac
  ;;
esac
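The date/grep line in the start branch is the part that decides whether I go online at all, so here is a small sketch of what it accepts. date +'%w/%H' prints the day of the week (0 for Sunday through 6 for Saturday), a slash, and the two-digit hour; the pattern lets through Monday to Friday with hours 07 through 19. The function name is_work_time is made up, and I have rewritten the pattern as an extended regular expression for grep -E, which should match the same strings as the \( \| \) form in the script.

```shell
# Hypothetical helper: succeed if the argument, formatted like the output of
# `date +%w/%H`, falls on a weekday (1..5) with an hour from 07 through 19.
is_work_time() {
    printf '%s\n' "$1" | grep -E '[1-5]/(0[789]|1[0-9])' > /dev/null
}

# Try a few made-up time stamps: Monday 07h, Friday 19h, Wednesday 06h,
# Wednesday 20h, Sunday 10h, Saturday 10h.
for stamp in 1/07 5/19 3/06 3/20 0/10 6/10; do
    if is_work_time "$stamp"; then
        echo "$stamp: go online"
    else
        echo "$stamp: stay offline"
    fi
done
```

Only the first two stamps should come out as “go online”. With the real clock, the call would be is_work_time "$(date +%w/%H)".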
It's a bit annoying that I have not really found out why purple-remote fails to pick up the session bus from the X server. But then stopping these kinds of investigations before the plot becomes either depressing or boring is the secret of administering modern Linux systems. For better or worse.
[1] Many people these days replace this find | xargs grep pattern with some recursive grep, where I think ripgrep is the most popular option. One benefit: it works with file names with blanks in them out of the box (with find and xargs, you'll have to pass -print0 and -0, respectively, to enable that). Me – I still like the flexibility of assembling the small tools, and I particularly like the second-order nature of xargs… but I digress[2].
[2] Then again, isn't digression the privilege of footnotes?