Tag Debian

  • Another Bookworm Regression: D-bus, X11 Displays, purple-remote, Oh My!

    When I reported on what broke when I upgraded to Debian bookworm, I overlooked that my jabber presence management (where I'm offline at night and on weekends) no longer worked. Figuring out why and fixing it was a dive into D-Bus and X11 that may read like a noir detective novel, at least if you are somewhat weird. Let me write it up for your entertainment and perhaps erudition.

    First off, in contrast to the March post, I have migrated to pidgin as my XMPP (“jabber”) client; at its core, presence management still involves a script in /etc/network/if-*.d where I used to call something like:

    su $DESKTOP_USER -c "DISPLAY=:0 purple-remote getstatus"
    

    whenever a sufficiently internetty network interface went up or down, where DESKTOP_USER contains the name under which I'm running my X session (see below for the whole script with the actual presence-changing commands).

    Purple-remote needs to run as me because it should use my secrets rather than root's. But it was the DISPLAY=:0 thing that told purple-remote how to connect to the pidgin instance it is to interrogate and control. Like most boxes today, mine is basically a single-user machine (at least as far as “in front of the screen” goes), and hence guessing the “primary” X display is simple and safe.

    Between X11 and the D-Bus

    That purple-remote needed the DISPLAY environment variable was actually almost a distraction from the start. There are many ways for Unix programs to talk to each other, and DISPLAY might have pointed towards 1980s-style X11 inter-client communication. But no, the purple-remote man page already says:

    This program uses DBus to communicate with Pidgin/Finch.

    Correctly spelled D-Bus, this is one of the less gruesome things to come out of the freedesktop.org cauldron, although it is still riddled with unnecessarily long strings, unnecessarily deep hierarchies, and perhaps even unnecessary use of XML (though I feel some sympathy in particular for that last point).

    But that's not what this post is about. I'm writing this because after upgrading to Debian bookworm, purple-remote no longer worked when used from my if-up.d script. Executing the command in a root shell (simulating how it would be called from ifupdown) showed this:

    # DESKTOP_USER=anselm su $DESKTOP_USER -c "DISPLAY=:0 purple-remote getstatus"
    No existing libpurple instance detected.
    

    A quick glance at the D-Bus Specification gives a hint at how this must have worked: dbus-launch – which is usually started by your desktop environment, and in my case by a:

    export $(dbus-launch --exit-with-x11)
    

    in ~/.xinitrc – connects to the X server and leaves a “property” (something like a typed environment variable attached to an X11 window) named _DBUS_SESSION_BUS_ADDRESS in, ah… for sure the X server's root window [careful: read on before believing this]. As the property's value, a D-Bus client would find a path like:

    unix:path=/tmp/dbus-1cAbvsX6FD,guid=795a0d...
    

    and it could open that socket to talk to all other D-Bus clients started within the X session.
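    Assuming the socket path from above, a quick sanity check is to point a D-Bus client at exactly that address; dbus-send will then list everything connected to that bus (a sketch, of course – your path will differ):

    DBUS_SESSION_BUS_ADDRESS="unix:path=/tmp/dbus-1cAbvsX6FD" \
      dbus-send --session --print-reply --dest=org.freedesktop.DBus \
      /org/freedesktop/DBus org.freedesktop.DBus.ListNames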

    Via apropos to xprop to Nowhere

    So… Does that property exist in the running X server? Hm. Can I figure that out without resorting to C programming? Let's ask the man page system:

    $ apropos property
    [..lots of junk...]
    xprop (1)            - property displayer for X
    [...]
    

    Typing in man xprop told me I was on the right track:

    $ man xprop
    
    SYNOPSIS
         xprop  […] [format [dformat] atom]*
    
    SUMMARY
      The xprop utility is for displaying window and font properties in an
      X server.
    
    OPTIONS
      […]
      -root   This argument specifies that X's root window is the target win‐
              dow.   This  is  useful  in situations where the root window is
              completely obscured.
    

    So, let's see:

    $ xprop -root _DBUS_SESSION_BUS_ADDRESS
    _DBUS_SESSION_BUS_ADDRESS:  not found.
    

    Huh? Has dbus-launch stopped setting the property? Let's inspect Debian's change log; a major change like that would have to be noted there, wouldn't it? Let's first figure out which package to look at; the documentation then is in /usr/share/doc/<packagename>:

    $ dpkg -S dbus-launch
    dbus-x11: /usr/bin/dbus-launch
    $ zless /usr/share/doc/dbus-x11/changelog.Debian.gz
    

    Looking for “property” or “BUS_ADDRESS” in there doesn't yield anything; that would make it unlikely that the property was somehow dropped intentionally. I have to admit I had halfway expected that, with something like “for security reasons”. But then if someone can read your root window's properties, access to your session bus is probably the least of your problems.
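    If you want to repeat that check without paging through the file, zgrep will do; in line with what I just said, it comes back empty:

    $ zgrep -Ei 'property|BUS_ADDRESS' /usr/share/doc/dbus-x11/changelog.Debian.gz
    $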

    Still, perhaps someone is slowly dismantling X11 support on grounds that X11 is kinda uncool? Indeed, you can build dbus-launch without X11 support. If the Debian maintainers built it that way, the respective strings should be missing in the binary, but:

    $ strings `which dbus-launch` | grep _DBUS_SESSION
    _DBUS_SESSION_BUS_PID
    _DBUS_SESSION_BUS_ADDRESS
    _DBUS_SESSION_BUS_SELECTION_
    

    No, that's looking good; dbus-launch should still set the properties.

    Skimming the Docs is Not Reading the Docs

    If I did not see the property a moment ago, perhaps I used xprop the wrong way? Well, actually: I didn't read the D-Bus spec properly, because what it really says is this:

    For the X Windowing System, the application must locate the window owner of the selection represented by the atom formed by concatenating:

    • the literal string "_DBUS_SESSION_BUS_SELECTION_"
    • the current user's username
    • the literal character '_' (underscore)
    • the machine's ID

    – and then find the _DBUS_SESSION_BUS_ADDRESS property on the window owning that selection. The root window thing was my own fantasy.

    If you bothered to skim the ICCCM document I linked to above, you may recognise the pattern: that's just conventional X inter-client communication – no wonder everyone prefers D-Bus.

    This is beyond what I'd like to do in the shell (though I wouldn't be surprised if xdotool had a hack to make that feasible). I can at least establish that dbus-launch still produces what the spec is talking about, because the “atoms” – a sort of well-known string within the X server and as a concept probably part of why folks are trying to replace X11 with Wayland – are all there:

    $ xlsatoms | grep DBUS
    488   _DBUS_SESSION_BUS_SELECTION_anselm_d162...
    489   _DBUS_SESSION_BUS_ADDRESS
    490   _DBUS_SESSION_BUS_PID
    

    The Next Suspect: libdbus

    Given that, dbus-launch clearly is exonerated as the thing that broke. The next possible culprit is purple-remote. It turns out that's a python program:

    $ grep -i dbus `which purple-remote`
    import dbus
        obj = dbus.SessionBus().get_object("im.pidgin.purple.PurpleService", "/im/pidgin/purple/PurpleObject")
    purple = dbus.Interface(obj, "im.pidgin.purple.PurpleInterface")
                data = dbus.Interface(obj, "org.freedesktop.DBus.Introspectable").\
    

    So, this is using the python dbus module. Let's see if its changelog says anything about dropping X11 support:

    $ zless /usr/share/doc/python3-dbus/changelog.Debian.gz
    

    Again, nothing for X11, property, or anything like that. Perhaps we should have a brief look at the code:

    $ cd /some/place/for/source
    $ apt-get source python3-dbus
    […]
    dpkg-source: info: extracting dbus-python in dbus-python-1.3.2
    […]
    $ cd dbus-python-1.3.2/
    

    You will see that the python source is in a subdirectory called dbus. Let's see if that talks about our property name:

    $ find . -name "*.py" | xargs grep _DBUS_SESSION_BUS_ADDRESS
    $
    

    No[1]. Interestingly, there's no mention of X11 either. Digging a bit deeper, however, I found a C module dbus_bindings next to the python code in dbus. While it does not contain promising strings (X11, property, SESSION_BUS…) either, that lack made me really suspicious, since at least the environment variable name should really be visible in the source. The answer is in the package's README: “In addition, it uses libdbus” – so, that's where the connection is being made?
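    One way to confirm that link is to run ldd on the C extension module (a sketch; the exact file name of the module depends on your Python version and architecture):

    ldd `dpkg -L python3-dbus | grep '_dbus_bindings.*\.so'` | grep libdbus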

    Another Red Herring

    That's a fairly safe bet. Let's make sure we didn't miss something in the libdbus changelog:

    $ zless /usr/share/doc/libdbus-1-3/changelog.Debian.gz
    

    You will have a déjà-vu if you had a look at dbus-x11's changelog above: the two packages are built from the same source and hence share a Debian changelog. Anyway, again there are no suspicious entries. On the contrary: An entry from September 2023 (red-hot by Debian stable standards!) says:

    dbus-user-session: Copy XDG_CURRENT_DESKTOP to activation environment. Previously this was only done if dbus-x11 was installed. This is needed by various freedesktop.org specifications…

    I can't say I understand much of what this says, but it definitely doesn't look as if they had given up on X11 just yet. But does that library still contain the property names?

    $ dpkg -L libdbus-1-3
    […]
    /lib/i386-linux-gnu/libdbus-1.so.3
    […]
    $ strings /lib/i386-linux-gnu/libdbus-1.so.3 | grep SESSION_BUS
    DBUS_SESSION_BUS_ADDRESS
    $
    

    No, it doesn't. That's looking like a trace of evidence: the name of the environment variable is found, but there's nothing said of the X11 property. If libdbus evaluated that property, it would stand to reason that it would embed its name somewhere (though admittedly there are about 1000 tricks with which it would still do the right thing without the literal string in its binary).

    Regrettably, that's another red herring. Checking the libdbus from the package in bullseye (i.e., the Debian version before bookworm) does not yield the property …

  • How to Pin a Wifi Access Point in Debian – and Why You Probably Don't Want to in Lufthansa Planes

    A vertical gradient from black to light blue, lots of unfilled template variables in double curly braces in white.

    That's what you see in Lufthansa's onboard wifi when you don't let just about anyone execute client-side Javascript on your machine. See below for a more useful URI in the onboard wifi.

    I have already confessed to flying recently (albeit only in German). What was new compared to the last time I was in a plane, five years ago[1]: not only did wifi signals apparently no longer confuse the aircraft's navigation systems, but there actually was an onboard wifi network with no fewer than seven access points within my machine's range.

    Somewhat surprisingly, I had a hard time getting a connection that would not break after a few seconds. I'll confess that's not the first time I've had trouble connecting to fancy networks recently, where the syslog contained cryptic messages like:

    kernel: wlan0: deauthenticated from <redacted> (Reason: 252=<unknown>)
    kernel: wlan0: disassociated from <redacted> (Reason: 1=UNSPECIFIED)
    

    In all these cases, there were a lot of access points with the same ESSID around, and so I suspect whatever selects the access points is currently broken on my machine; it chooses really weak access points and then gets badly mangled signals. While I'm waiting for this to heal by itself, I am resorting to manually picking and pinning the access points. In case you use ifupdown to manage your wifi, perhaps this little story is useful for you, too.

    The first part is to pick an access point. To do that, I ignore the warning of the authors of iw (from the eponymous package) not to parse its output and run:

    sudo iw wlan0 scan | egrep "^BSS|signal: .*dBm|SSID:"
    

    Addendum (2023-11-02)

    Well, in non-plane situations it's wise to get the SSIDs, too, so you can see which APs actually belong to the network you want to join. Hence, I've updated the grep in the command line above.

    On the plane I was in, the output looked like this:

    BSS 00:24:a8:83:37:93(on wlan0)
            signal: -68.00 dBm
    BSS 00:24:a8:ac:1d:93(on wlan0)
            signal: -41.00 dBm
    BSS 00:24:a8:83:37:82(on wlan0)
            signal: -62.00 dBm
    BSS 00:24:a8:ac:1d:82(on wlan0)
            signal: -48.00 dBm
    BSS 00:24:a8:83:37:91(on wlan0)
            signal: -60.00 dBm
    BSS 00:24:a8:83:76:53(on wlan0)
            signal: -76.00 dBm
    BSS 00:24:a8:83:77:e2(on wlan0)
            signal: -82.00 dBm
    

    The things after the “BSS” are the MAC addresses of the access points; the number after signal is a measure of the power that reaches the machine's antenna[2] from that access point, where less negative means more power. So, with the above output you want to pick the access point 00:24:a8:ac:1d:93.
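    Still ignoring the don't-parse-this warning, here is a little pipeline that sorts the candidates so the strongest access point comes out last:

    sudo iw wlan0 scan \
      | awk '/^BSS/ {mac=$2; sub(/\(.*/, "", mac)} /signal:/ {print $2, mac}' \
      | sort -n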

    With ifupdown, you do that by editing the stanza for that Wifi and add a wireless-ap line; for me, this then looks like:

    iface roam inet dhcp
      wireless-essid Telekom_FlyNet
      wireless-ap 00:24:a8:ac:1d:93
    

    – and this yields a stable connection.

    I must say, however, that the services on that network (I'm too stingy for actual internet access, of course) are a bit lacking, starting with the entirely botched non-Javascript fallback (see above). At least there is http://services.inflightpanasonic.aero/inflight/services/flightdata/v1/flightdata where you will see some basic telemetry in JSON. Or wait: it's actually perimetry if you see speed, height, and other stuff for the plane you're on.

    By fetching the numbers from the JSON you will save a lot of power compared to the web page, which becomes extremely network-chatty and CPU-hoggy (on webkit, at least) once you let Lufthansa execute Javascript. I'm afraid I have too much flight shame (and hence too little use for it) to cobble something nice together with that API and qmapshack. But it certainly looks like a fun project.
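    In a shell, something like this should do for a look at the raw numbers (assuming curl and python3 are at hand):

    curl -s http://services.inflightpanasonic.aero/inflight/services/flightdata/v1/flightdata \
      | python3 -m json.tool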

    [1]Ah wait… now that I think about it again, I seem to remember that on one of my last sinful travels there already was a plane with on-board wifi. But it certainly makes a nicer story with the little lie of novelty when coming back after five years.
    [2]Since “dBm” stands for “decibel-milliwatt”, you could compute that power as 10^(s/10) mW, where s is the number in front of the dBm. I'd not trust the absolute numbers, as they would indicate here that one access point is a factor of ten thousand stronger than another one, which sounds implausible primarily because I'd be surprised if the circuitry of the wifi card could deal with such a high dynamic range. And “I'm getting 0.0001 milliwatts from the AP” is a statement in dire need of interpretation anyway (e.g., “in the carrier? Bolometric?”). But let's not go there.
  • How to Disable pdf.js in Webkit on Debian

    A window of the zathura PDF viewer showing the GDPR.

    This is how I want my PDFs rendered. And I want a j to scroll down a bit. That pdf.js fails on both counts is just the first two of its defects.

    When I upgraded to Debian bookworm, I noticed with great dismay that the webkit browser engine it comes with has a pdf.js-based PDF renderer built in.

    That means that my preferred browser, luakit, is basically broken when dealing with PDFs: where I disable Javascript (i.e., by default), I see nothing at all. Where I allow Javascript, my PDFs appear in a UI I consider rather nasty. On top of that, I lose the nice archive of PDFs I've recently read that came with luakit's viewpdf extension. That holds true even if I do manage to properly open the PDF in my preferred renderer (zathura) using pdf.js's Save, as that blindly calls all PDFs “document.pdf”.

    Regrettably, there doesn't seem to be a runtime switch to turn off the in-browser PDF rendering. After poking around a bit in webkit's source code, I have convinced myself that I won't add that switch myself. I am just not desperate enough to start hacking on one of the major browser engines.

    But there is a build-time switch to turn pdf.js off. I have always shied away from building my own webkit packages because there's so horribly much code and C++ compilers are so terribly resource-hungry. But my suffering with the pdf.js disaster has reached a level that made me overcome that horror. So, here's how to build a Webkit such that browsers based on it will again handle PDFs properly (sc. by handing them over to the system). All this is for Debian bookworm and derivatives; let's hope it won't be necessary beyond that.

    1. Get the source:

      mkdir -p src/webkit
      cd src/webkit
      apt-get source webkit2gtk
      cd webkit2gtk*
      

      This will only work if you have configured a source repo for your suite in your /etc/apt/sources.list (or equivalent) and run apt update after that.

      This pulls in about 50 megabytes, which in itself is an argument in favour of netsurf. But these 50 megs are peanuts compared to what's coming: by the time you've done a full build, this directory will have exploded into more than 3 GB (on i386). Let's fix the web so plain browsing doesn't require such monsters.

    2. Configure your build. Fortunately, you mostly only touch the debian/rules file. In there, change:

      ENABLE_SOUP2=YES
      ENABLE_SOUP3=YES
      ENABLE_GTK4=YES
      

      to (presumably):

      ENABLE_SOUP2=YES
      ENABLE_SOUP3=NO
      ENABLE_GTK4=NO
      

      That's for luakit, which is built on top of soup2; if your browser uses a different API, make a different choice here. Each build takes forever and gobbles up about 3 gigs in the process, so be stingy here.

      Then, locate the line -DENABLE_MINIBROWSER=ON (which currently concludes the EXTRA_CMAKE_ARGUMENTS) and change it to:

      -DENABLE_MINIBROWSER=ON \
      -DENABLE_PDFJS=OFF \
      -DENABLE_JOURNALD_LOG=OFF
      

      Disabling the journald log is not strictly necessary, but it helps building on non-systemd boxes, and I doubt it actually hurts anyone.

      Addendum (2024-01-21)

      At least with 2.40.3, this procedure ends in a:

      dh_install: error: missing files, aborting
      

      presumably because we are not building for two APIs. I think that's a bug, but from dh_install's manpage I cannot even understand why it thinks it should fail because of missing files, and consequently futzing around with debian/not-installed or the various options went nowhere. Because I'm really grumpy with the whole state of affairs, I quickly resigned myself to simply emptying all debian/*.install files not pertinent to the packages I want to build (there is a sketch of that after step 7).

    3. Remove the systemd build dependency. We can do that because we have just disabled the JOURNALD_LOG. So, in debian/control, delete the line:

      libsystemd-dev [linux-any],
      
    4. Install the build dependencies:

      sudo apt-get build-dep webkit2gtk
      

      On non-systemd boxes, this will say something like:

      libelogind0 : Conflicts: libsystemd0
      

      because you have not removed the libsystemd dependency from apt's database in step (3), and webkit at this point doesn't know it could build with libelogind0-dev, too. Don't worry about it as long as all the other build-dependencies came in.

    5. Make a changelog entry so your system knows your build is “newer” than Debian's and you can later tell it's your custom build:

      dch -i
      

      You probably want to put something like “rebuild with PDFJS disabled” in there, but that's exclusively for your own comfort unless you start distributing your package.

    6. Do the build:

      dpkg-buildpackage -j6 -b -uc -us -rfakeroot
      

      Do that on a cold day, because this will turn your machine into a space heater for several hours (unless you have a very fast machine, in which case you probably don't need another space heater in the first place).

    7. When this is done, you will have about a dozen binary packages in the build directory's parent. You probably don't want to dpkg -i *.deb, as there's no point installing debug packages (for starters). For luakit, I've run this:

      sudo dpkg -i gir1.2-javascriptcoregtk-4.0_2.*.deb gir1.2-webkit2-4.0_2.*.deb libjavascriptcoregtk-4.0-18_2.*.deb libjavascriptcoregtk-4.0-bin_2.*.deb libjavascriptcoregtk-4.0-dev_2.*.deb libwebkit2gtk-4.0-37_2.*.deb libwebkit2gtk-4.0-dev_2.*.deb
      

      This could be a slight over-installation.
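    For reference, here is roughly what the brute-force emptying of the debian/*.install files from the addendum to step (2) could look like; the pattern list is a guess – adapt it to the packages you actually want to build:

    cd debian
    for f in *.install; do
      case $f in
        libwebkit2gtk-4.0*|libjavascriptcoregtk-4.0*|gir1.2-*) ;;  # keep these
        *) : > "$f" ;;                                             # empty the rest
      esac
    done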

    By the way, in case the build fails somewhere in the middle but is fundamentally sane, you can resume it by calling:

    fakeroot debian/rules binary
    

    Doing dpkg-buildpackage as above resets the build and will discard everything the computer has built in perhaps hours.

    Given the extreme cost of building a webkit, getting pdf.js out in this way is not a long-term plan, at least if you want your webkit to be halfway up-to-date (which is a good idea in particular if you're indiscriminate as to who can execute Javascript in your browser). Until someone kindly implants a run-time switch, I'm going to shut out pdfjs-infested upgrades until some really, really unnerving (that is, even more unnerving than usual) webkit vulnerability surfaces. To do that, I'm dropping:

    # my webkit with patched-out pdfjs
    Package: libjavascriptcoregtk-4.0-18
    Pin: version 2.40.5-1~deb12u1.1
    Pin-Priority: 1001
    

    into /etc/apt/preferences.d/10pins (where your version will probably be different; check the version tag in the names of the generated package files). That will make the messages from apt upgrade quite a bit uglier, and of course I'll have a webkit with published security bugs (you have been warned in case you're doing as I do). But in my book that's totally worth it just to get rid of the wretched pdf.js.
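    To verify that the pin took, apt-cache policy should report your own version as installed and show the 1001 pin:

    apt-cache policy libjavascriptcoregtk-4.0-18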

  • OpenSSL, Syslog, and Unexpected Consequences of Usrmerge: Upgrading to bookworm

    A few weeks after the release of Debian bookworm, I dist-upgraded my main, ah well, workstation, too. As mentioned in my bullseye upgrade post, that box's file system is ancient, and the machine does many things in perhaps unusual ways, which includes still booting with sysvinit rather than systemd for quite a few reasons. Hence, it always brings up some interesting upgrade probl^H^H^H^H^Hchallenges. While for bullseye, the main… um… challenge for me was the migration to python3, this time the big theme was dropped crypto engines.

    Rsyslogd, wmnet

    Much more harmless than those, but immediately visible after the upgrade, was that my syslog display remained empty. The direct reason was that the rsyslog daemon was not running. The reason for that, in turn, was that there was not even an init script for it in /etc/init.d, let alone rc.d links to it. But the rsyslog package was installed. What would the purpose be of installing a daemon package without an init script?

    The Debian bug tracker had something like an answer: the maintainer took it out, presumably to shed files they considered cruft in the age of systemd. Although I have to concur with Naranyan's remark in the bug report that rsyslog will typically be in place exactly when systemd (with its own log daemon) is not, at least that bug (#1037039) offers the (simple) fix: Install the orphan-sysvinit-scripts package.

    Something a bit harder to explain is that the nice wmnet applet for monitoring transfers on network interfaces came up blank after the upgrade. This is fixed by passing a -n option to it, which tells it to draw into a normal window rather than something suitable for the Windowmaker dock. Wmnet (as perhaps other Windowmaker applets, too) tries to guess where to draw based on some divination. Perhaps my window manager sawfish started to pretend it's Windowmaker in bookworm? Or indicated to wmnet in some other way it was living in a Windowmaker dock? Hm: Given that the last changelog entry on sawfish itself is from 2014 (which I consider a sign of quality), that seems unlikely, but then I can't bring myself to investigate more closely.

    The usr Merge and Bashism in the Woodwork

    Although I had merged the root and usr file systems on that box last time I migrated to a new machine, I had postponed doing the usrmerge thing (i.e., making the content of /bin and /usr/bin identical) on the box until the last possible moment – that is, the bookworm installation – because I had a hunch some hack I may have made 20 years ago would blow up spectacularly.

    None did. Except… it turned out I had linked /bin/sh to /bin/bash for some long-forgotten and presumably silly reason; if you had asked me before the upgrade, I'd have confidently claimed that of course all my little glue scripts are executed by Debian's parsimonious dash rather than the relatively lavish bash. Turns out: they weren't.

    With the installation of the usrmerge package during the bookworm dist-upgrade that is over. /bin/sh is now dash as I had expected it to be all the time. I have to admit I am a bit disappointed that I do not notice any difference in system snappiness at all.

    But I did notice that plenty of my scripts were now failing because they contained a bashism: Comparison for string equality in POSIX-compliant [ ... ] constructs is not the C-like == but the SQL-like = even though bash accepts both. I don't know when I forgot this (or, for that matter, whether I ever knew it), but a dozen or so of my (often rather deeply embedded) shell scripts started to fail with messages like:

    script name: 22: [: tonline: unexpected operator
    

    So, repeat after me: In shell scripts, compare strings with = and numbers with -eq. And I have to admit that this experience made me a bit more sympathetic to the zero shell paradigm behind systemd. But for the record: I still think the advantages of having hooks for shell scripts almost everywhere overall outweigh these little annoyances.
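    You can see the difference first-hand without writing a script (dash's exact wording may differ between versions):

    $ bash -c '[ a == a ] && echo equal'    # bash tolerates the bashism
    equal
    $ dash -c '[ a == a ] && echo equal'    # dash does not
    dash: 1: [: a: unexpected operator
    $ dash -c '[ a = a ] && echo equal'     # the POSIX spelling works in both
    equal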

    The OpenSSL Upgrade

    With the bookworm upgrade, a fair number of hashes and ciphers were declared “legacy” in openssl, which means that in the default configuration, it will reject them. That had a few rather disruptive consequences: For one, I needed to update a few snake-oil certificates I had generated for playing with https on my box.

    Also, fetchmail failed for a POP server I had configured with a message like:

    fetchmail: <hostname> SSL connection failed.
    fetchmail: socket error while fetching from <whatever>
    

    I was puzzled for a while until I realised that the recipe said:

    with proto TLS1
    

    That was probably valuable in, like, 2004, to suppress ancient (relatively) easily breakable SSL versions, but by now it didn't let fetchmail negotiate crypto that was still allowed by openssl. Removing the proto TLS1 fixed that problem.

    The most unnerving breakage, however, was that my preferred disk crypto, encfs (cf. this advocacy in German), broke for some volumes I had created more than a decade ago: they failed to mount because openssl now refuses (I think) the blowfish cipher. I fiddled around a bit with re-enabling legacy algorithms as per Debian bug 1014193 but quickly lost my patience with the slightly flamboyant semantics of openssl.cnf. To my surprise, downgrading to encfs_1.9.5-1+b2_i386.deb from bullseye (by briefly re-adding the sources.list lines) let me mount the old volumes again. I then simply created new encfs volumes and rsync -av-ed from the old decrypted volume into the new decrypted volume. Finally, after unmounting everything encfs, I overwrote the old encrypted volumes with the new encrypted volumes and upgraded back to bookworm encfs.

    Since I can't explain why downgrading encfs would have fixed the problem as I've analysed it and hence suspect that a part of my analysis (and fix) is wrong, I'd strongly recommend running:

    encfsctl info <encrypted volume>
    

    on each encfs directory you have before the upgrade. If it says something like:

    Filesystem cipher: "ssl/blowfish", version 2:1:1 (using 3:0:2)
    

    or even just:

    Version 5 configuration; created by EncFS 1.2.5 (revision 20040813)
    

    (where I have not researched the version where encfs defaults became acceptable for bookworm openssl; 1.9 is ok, at any rate), copy over the decrypted content into a newly created encfs container; it's quick and easy.
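    For instance, assuming your encrypted containers sit below ~/crypt (adjust the glob to your layout), you could check them all in one go:

    for d in "$HOME"/crypt/*; do
      echo "== $d =="
      encfsctl info "$d"
    done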

    Relatedly, bookworm ssh also disallows a few crypto methods now deemed insecure by default, in particular SHA-1 hashes for host keys. Now, I have to connect to a few hosts I cannot upgrade (either because I'm not root or because they are stuck on some ancient kernel because of proprietary kernel components). For these, when trying to connect I now get messages like this:

    Unable to negotiate with 192.168.20.21 port 22: no matching host key type found. Their offer: ssh-rsa,ssh-dss
    

    You could reasonably argue I should discard boxes of that type. On the other hand, nobody will spend 50'000 Euro to eavesdrop on my communications with these machines[1] – that's the current estimate for producing a hash collision for an ssh host key, which this is about. Hence, I'm happy to risk man-in-the-middle attacks for these machines.

    To deal with such situations, openssh lets you selectively re-allow SHA-1 hashes on RSA host keys. Helpfully, /usr/share/doc/openssh-client/NEWS.Debian.gz gives a recipe to save those hosts; put host stanzas like:

    Host ancient-and-unupdatable.some.domain
      HostKeyAlgorithms=+ssh-rsa
      PubkeyAcceptedKeyTypes +ssh-rsa
    

    into ~/.ssh/config (and do read ssh_config (5) if you are not sure what I'm talking about, regardless of whether or not you have this particular problem). Incidentally, just to save that one machine where you forgot to update your ancient DSA public key, you can for a brief moment change the second line to:

    PubkeyAcceptedKeyTypes +ssh-rsa,ssh-dss
    

    If you don't have an RSA key yet, create one (ssh-keygen -t rsa) – RSA keys work even on the most venerable openssh installations that don't yet know about the cool ed25519 keys. Connect to the server, install the RSA public key, and re-remove the ssh-dss part in the config again.
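    In commands, that dance might look like this (host name as in the stanza above):

    ssh-keygen -t rsa -b 4096
    ssh-copy-id -i ~/.ssh/id_rsa.pub ancient-and-unupdatable.some.domain
    # then take the ssh-dss part out of ~/.ssh/config again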

    Kudos to the openssh maintainers for keeping compatibility even in crypto over more than 20 years. And shame on many others – including me – who don't manage to do that even in non-crypto software.

    Terrible Font Rendering in Swing

    One of the more unexpected breakages after the upgrade was that some Java Swing (a once-popular GUI toolkit) applications suddenly had terribly jagged fonts, such as my beloved TOPCAT:

    Part of a screenshot of a menu with horribly jaggy letters

    I cannot exactly say why this looks so terrible[2]. Perhaps in the age of 300 dpi displays font hinting – which is supposed to avoid overly jagged pixelisation when rendering vector fonts at low resolutions – has gone out of fashion, perhaps OpenJDK now …

  • Fixing “libqca-ossl is missing”

    In all honesty, I don't expect many people who might profit from this post will ever see the message below. But since common web searches don't yield anything for it (yet), I figure I should at least document it can happen. I also want to praise kwallet's author(s) because whatever went wrong yielded what turned out to be a rather useful error message rather than a spectacular crash:

    createDLGroup failed: maybe libqca-ossl is missing
    

    Here's what led up to it: in Debian bookworm, my old Mastodon client tootle started crashing when viewing images. Its development has moved to a new client called Tuba, and even though that is not packaged yet, I figured I might as well move on now rather than fiddle with tootle. Tuba, however, needs a password manager more sophisticated than the PGP-encrypted text file I use otherwise. So I bit the bullet and installed kwalletmanager; among the various password managers, it seemed to have the most reasonable dependencies.

    With that, Tuba can do the oauth dance it needs to be able to communicate with the server. But when it tries to save the oauth token it gets from the Mastodon instance, I get the error message above. Tuba can still talk to the server, but once the session is over, the oauth token is lost, and the next time I start Tuba, I have to do the oauth dance again.

    Fixing the error seemed simple enough:

    $ apt-file search libqca-ossl
    libqca-qt5-2-plugins: /usr/lib/i386-linux-gnu/qca-qt5/crypto/libqca-ossl.so
    $ sudo apt install libqca-qt5-2-plugins
    

    – as I said: kwallet's is a damn good error message. Except the apt install did not fix the problem (which is why I bother to write this post). That's because kwalletmanager starts a daemon, and that daemon is not restarted just because the plugins are installed.

    Interestingly, just killing that daemon didn't seem to fix the problem; instead, I had to hit “Close“ in kwalletmanager explicitly and then kill the daemon (as in killall kwalletd):

    Screenshot: kdewallet with a close button and two (irrelevant) tabs.

    I give you that last part sounds extremely unlikely, and it's possible that I fouled something up the first time I (thought I) killed kwalletd. But if you don't want to do research of your own: Just hit Close and relax.

    You could also reasonably ask: Just what is this “ossl” thing? Well… I have to admit that password wallets rank far down in my list of interesting software categories, and hence I just gave up that research once nothing useful came back when I asked Wikipedia about OSSL.

  • Browsing Peace and Privacy With dnsmasq

    Screenshot of the dnsmasq extra configuration page in freetz

    You can even have the DNS-based adblocking discussed here in your whole network if your router runs dnsmasq (it probably does) and you can edit its configuration (you probably can't). As shown here, with freetz you can.

    I'm not a big fan of in-browser adblocking. For one, I have my doubts about several of the extensions – Adblock plus, for instance, comes from a for-profit, though I give you this critique might be partisan. Also, I like to switch browsers freely and certainly don't want to maintain block lists for each of them, and finally quite a few clients other than browsers may render HTML and hence ads.

    At least with the pages I want (and don't want) to read, there's a much lighter alternative: DNS-based adblocking. You see, on the relatively few commercial pages I occasionally have reason to visit, ads, tracking pixels, and nasty javascript typically are served from a rather small set of domains – doubleclick.net, googleadservices.com, and a few more like these. If I can make my computer resolve these names to 127.0.0.1 – that is, my computer in IPv4, or yours, if you type that address –, everything your browser would pull from these servers is instantly gone in everything rendering HTML.

    So, how do you do that? Well, you first make sure that your computer does the name resolution itself[1]. On Debian, you do that by installing the packages resolvconf (without a second e; in a systemd environment I think you want to use systemd-resolved instead) and dnsmasq; that's really all, and that ought to work out of the box in all reasonably common situations:

    $ sudo apt install resolvconf dnsmasq
    

    You will probably have to bring your network down and up again for this to take effect.
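    A quick way to check that your box now resolves through dnsmasq: with resolvconf and dnsmasq installed, /etc/resolv.conf should (essentially) point to localhost:

    $ cat /etc/resolv.conf
    nameserver 127.0.0.1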

    Once that's done, you can tell dnsmasq what names to resolve to what. The man page dnsmasq(8) documents what to do under the --address option – you could actually configure dnsmasq through command line options exclusively –, where you can read:

    -A, --address=/<domain>[/<domain>...]/[<ipaddr>]

    Specify an IP address to return for any host in the given domains. […] A common use of this is to redirect the entire doubleclick.net domain to some friendly local web server to avoid banner ads. The domain specification works in the same was [sic, as of bullseye] as for --server […]

    – and from the documentation of --server you learn that <domain> is interpreted as a suffix (if you will), such that if you give an address for, say, google.com, it will also be used for foo.google.com or foo.bar.google.com.

    But where do these address expressions go? Well, at least in Debian, dnsmasq will read (essentially, see the README in there) any file you drop into /etc/dnsmasq.d and add its content to its configuration. Having configuration snippets in different files really helps maintenance and dist-upgrades in general; in this case, it also helps distributing the blacklist, as extra configuration that may be inappropriate on a different host is kept in some other file.

    I tend to prefix snippet names with numbers in case order might one day matter. So, I have a file /etc/dnsmasq.d/10spamreduce.conf containing:

    address=/doubleclick.net/127.0.0.1
    address=/xiti.com/127.0.0.1
    address=/adform.net/127.0.0.1
    address=/qualtrics.com/127.0.0.1
    address=/criteo.com/127.0.0.1
    address=/exactag.com/127.0.0.1
    address=/optimizely.com/127.0.0.1
    address=/googleadservices.com/127.0.0.1
    address=/googletagmanager.com/127.0.0.1
    address=/ivwbox.com/127.0.0.1
    address=/ivwbox.de/127.0.0.1
    address=/connect.facebook.de/127.0.0.1
    address=/facebook.net/127.0.0.1
    address=/facebook.com/127.0.0.1
    address=/addthis.com/127.0.0.1
    address=/update.googleapis.com/127.0.0.1
    address=/googleusercontent.com/127.0.0.1
    address=/edgekey.net/127.0.0.1
    address=/ioam.de/127.0.0.1
    address=/cookiebot.com/127.0.0.1
    address=/moatads.com/127.0.0.1
    address=/fonts.gstatic.com/127.0.0.1
    address=/fonts.googleapis.com/127.0.0.1
    address=/ping.chartbeat.net/127.0.0.1
    address=/cookielaw.org/127.0.0.1
    

    When you do the same thing, you should restart dnsmasq and then see the effect like this:

    $ sudo service dnsmasq restart
    $ dig +short fonts.gstatic.com
    127.0.0.1
    

    As you can see, I have also included some trackers and other sources of annoyance in my address list. Of course, if you actually want to read Facebook (ugh) or need to pull Google's fonts (ughugh), you'll have to adapt that list a bit.

    In case you have interesting and useful contributions to this list: Please do write in!

    [1]Regrettably, with things like DNS over HTTPS, it could be that your browser actually will not use your computer's DNS resolver. Adblocking hence is one extra reason to disable DoH when you see it.
  • Work-Life Balance and Privacy with Bash, D-Bus, gajim and ifupdown

    A small screenshot showing an offline icon

    Sunday morning: my gajim is automatically offline. This post explains how I'm doing that.

    I still consider XMPP the open standard for “chat” (well, instant messaging), and I have been using Psi as an XMPP client for almost 20 years now. However, since Psi has occasionally crashed on me recently (as in: at least since Bullseye), presumably on receiving some message, I consider it a certainty that it is remotely exploitable. Given its large codebase I don't think I want to fix whatever is wrong myself, and I don't think there are still people maintaining Psi.

    I therefore migrated to gajim last week; after all, one of the nice things about open standards is that there are usually multiple implementations. This, however, made me update an ancient hack to automatically manage my status so that I'm XMPP-offline when it's nobody's business whether or not my machine is on.

    In this post, I'd like to tell you how that works, hoping it may be useful to solve other (but similar; for instance: get offline when doing talks) problems, too.

    Not Always Online

    First off, the major reason I'm not much of a fan of synchronous messaging (which IM is, and email is not) is that it requires some sort of “presence” notification: something needs to know whether I am online, and where I can be reached. At least in XMPP, all your contacts additionally get to know that, too.[1]

    While I admit that can be useful at times, during the night and on weekends I really don't want to publish when my computer is on and when it's not. Hence I have so far told my Psi, and am now telling my gajim, to not automatically re-connect on weekends or between 20:00 and 7:00. That I can specify this perhaps somewhat unique preference illustrates how great it is to have shell integration everywhere. The ingredients are:

    • ifupdown, Debian's native network management. If you're using systemd or NetworkManager or something, I think these use other hooks [if you've tried it, let me know so I can update this].
    • D-Bus, a framework to communicate between programs sitting on a common X11 display (though with gajim, D-Bus becomes somewhat hidden).
    • the shell, which lets you write little ad-hoc programlets and duct-tape together all the small utilities that have accumulated in Unix since the early 1970s (here: logger, date, and egrep).
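    As a teaser for the duct tape: the script below decides “office hours” with date and grep alone, and you can play with that test in isolation (the output, of course, depends on when you run it):

    $ date +'%w/%H'     # weekday (0 is Sunday) and hour
    1/09
    $ date +'%w/%H' | grep '[1-5]/\(0[789]\|1[0-9]\)' && echo office hours
    1/09
    office hours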

    Inter-Process Communication with D-Bus

    The first thing I want to do is take gajim offline before a network interface goes down. That way, people don't have to wait for timeouts to see I am unavailable (unless someone pulls the cable or the Wifi disappears – without a network, gajim can't sign off). That means I have to control a running gajim from the outside, and the standard way to do that these days is through D-Bus, a nifty, if somewhat over-complicated, way of calling functions within programs from other programs.

    One of these other programs is qdbus, which lets you inspect what listens on your session's (or, with an option, the system's) D-Bus and what functions you can call where. For instance:

    $ qdbus org.gajim.Gajim /org/gajim/Gajim
    ...
    method void org.gtk.Actions.SetState(QString action_name, QDBusVariant value, QVariantMap platform_data)
    ...
    

    In Psi, with a bit of fiddling, a generic D-Bus tool was enough to switch the state. Since there's a QDBusVariant in the arguments gajim's SetState method wants according to the qdbus output, I don't think I could get away with that after the migration – qdbus does not seem to be able to generate that kind of argument.

    Enter gajim-remote

    But gajim comes with a D-Bus wrapper of its own, gajim-remote, and with that, you can run something like:

    gajim-remote change_status offline
    

    Except that won't work out of the box. That's because gajim comes with remote control disabled by default.

    To enable it, go to Preferences → Advanced, click Advanced Configuration Editor there, and then look for the remote_control configuration item. I have no idea why they've hidden that eminently useful setting so well.

    Anyway, once you've done that, you should be able to change your status with the command above and:

    gajim-remote change_status online
    

    ifupdown's Hooks

    I now need to arrange for these commands to be executed when network interfaces go up and down. These days, it would probably be smart to go all the way and run a little daemon listening to D-Bus events, but let me be a bit less high-tech, because last time I looked, something like that required actual and non-trivial programming.

    In contrast, if you are using ifupdown to manage your machine's network interfaces (and I think you should), all it takes is a bit of shell scripting. That's because ifupdown executes the scripts in /etc/network/if-up.d once a connection is up, and the ones in /etc/network/if-down.d before it brings a connection down in a controlled fashion. These scripts see a few environment variables that tell them what's going on (see interfaces(5) for a full list), the most important of which are IFACE (the name of the interface being operated on), and MODE, which would be start or stop, depending on what ifupdown is doing.

    The idea is to execute my change_status commands from these scripts. To make that a bit more manageable, I have a common script for both if-up.d and if-down.d. I have created a new subdirectory /etc/network/scripts for such shared ifupdown scripts, and I have placed the following file in there as jabber:

    #!/bin/sh
    # State management of gajim
    
    DESKTOP_USER=msdemlei

    logger Jabber: $MODE $IFACE $LOGICAL

    case $MODE in
    start)
      case $IFACE in
      eth* | wlan* | n900)
        if ! date +'%w/%H' | grep '[1-5]/\(0[789]\|1[0-9]\)'  > /dev/null; then
          exit 0
        fi
        su - $DESKTOP_USER -c 'DISPLAY=:0 gajim-remote change_status online "Got net"' > /dev/null || exit 0
        ;;
      esac
      ;;
    
    stop)
      case $IFACE in
      eth* | wlan* | n900)
        if [ tonline = "t`su $DESKTOP_USER -c 'DISPLAY=:0 gajim-remote get_status'`" ]; then
          su - $DESKTOP_USER -c "DISPLAY=:0 gajim-remote change_status offline 'Losing network'" || exit 0
          sleep 0.5
        fi
        ;;
      esac
      ;;
    esac
    

    After chmod +x-ing this file, I made symbolic links like this:

    ln -s /etc/network/scripts/jabber /etc/network/if-down.d/
    ln -s /etc/network/scripts/jabber /etc/network/if-up.d/
    

    – and that should basically be it (once you configure DESKTOP_USER).

    Addendum (2023-12-02)

    Let me admit that this never really worked terribly well with gajim, mainly because – I think – its connections don't time out, and so once a status update has failed for one reason or another, gajim would be in a sort of catatonic state. That's one of the reasons I moved on to pidgin, and its state management again broke when upgrading to Debian bookworm. My current script is near the bottom of this December 2023 post.

    Debugging Admin Scripts

    Because it is a mouthful, let me comment a bit about what is going on:

    logger Jabber: $MODE $IFACE $LOGICAL
    

    logger is a useful program for when you have scripts started deeply within the bowels of your system. It writes messages to syslog, which effectively lets you do printf debugging of your scripts. Once everything works for a script like this, you probably want to comment the logger lines out.

    Note that while developing scripts of this kind, it is usually better to just get a normal shell, set the environment variables (or pass the arguments) that you may have obtained through logger, and then run the script interactively, possibly with a -x option (print all statements executed) passed to sh. For instance:

    $ MODE=start IFACE=wlan0 sh -x /etc/network/scripts/jabber
    + DESKTOP_USER=anselmf
    + logger Jabber: start wlan0
    + case $MODE in
    + case $IFACE in
    + date +%w/%H
    + grep '[1-5]/\(0[789]\|1[0-9]\)'
    + exit 0
    

    – that way, you see exactly what commands are executed, and you don't have to continually watch /var/log/syslog (or journalctl if that's what you have), not to mention (for instance) bring network interfaces up and down all the time.

    Case Statements in Bourne's Legacy

    The main control structure in the script is:

    case $MODE in
    start)
      ...
      ;;
    stop)
      ...
      ;;
    esac
    

    Case statements are one of the more powerful features of descendants of the Bourne shell. Read about them in the excellent ABS in case you are a bit mystified by the odd syntax and the critically important ;; lines.

    The particular case construct here is there so I can use the same script for if-up.d and if-down.d: it dispatches on whatever is in MODE. In case MODE is something other than start or stop, we silently do nothing. That is not always a good idea – programs failing without complaints are a major reason for the lack of hair on my head –, but since this isn't really user-callable, it's probably an acceptable behaviour.
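    If you would rather have the script complain, a *) default branch is all it takes; a sketch:

    case $MODE in
    start)
      ...   # bring gajim online as above
      ;;
    stop)
      ...   # take gajim offline as above
      ;;
    *)
      logger "jabber hook: unexpected MODE '$MODE'"
      ;;
    esac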

    General rule of thumb, though: Be wary of case .. esac without a *) (which gives commands executed when nothing …

  • BahnBonus Without a Google Id and on Your Own Computer

    Screenshot: a colourful app screen with little information and a call for donations. It is Deutsche Bahn's BahnBonus app.

    The object of desire: the BahnBonus app that will get me into the DB lounges again. Entirely without Apple, and with no more than a simple overdose of Google.

    A bit less than a year ago, I made a grand confession: yes, I take part in one of those stupid, snooping customer loyalty programmes, and in Deutsche Bahn's at that, where frequent travellers cosily sip cocoa in armchairs while ordinary passengers freeze outside on the platform or have to fight over the few seats in the station buildings: formerly bahn.comfort, now BahnBonus.

    In the post cited, I mourned the lost lounge access, because for about a year now, Bahn only lets people into the lounges who use remotely-maintained computers (“smartphones”), and rather new ones at that. Instead of the old plastic card, it now takes an, ahem, app. As I now know (and suspected back then), that app does little more than display the login screen of Bahn's web site and then generate links to QR codes. Probably somewhat naively, I hoped at the time that Bahn would put the few lines of Javascript this takes onto their normal web page, too.

    That did not happen. When Bahn recently sent me BahnBonus advertising on paper (“You have gold status!”), I once more wrote a mail to Bahn's support asking what had become of QR codes on the web page. Again, the answer was a no without further explanation. But since Bahn enclosed quite a few vouchers (in particular for lounge access) with the negative reply, I took it as a “nope, we're never going to do that, even though it would be simple”. Perhaps this is about collecting data, perhaps it is simply corporate policy.

    Anyway: if I want to sip cocoa in the warmth again, I need some way to get at the QR codes in the long run. Remotely-maintained computers are an option for me at most inside virtual machines, and so I thought: let me try whether I can get the BahnBonus app to run on my normal computer.

    Turns out: it works, and if you let Google run riot inside a VM, even with tolerable effort. Let me write down what it took for me; it may help a bit with other digital coercions, too.

    Setting up Android in QEMU

    For the following steps I assume a Debian (bullseye) on an Intel or AMD system that is not much older than 15 years. In principle, though, almost everything should also work on any other platform that runs Qemu.

    If you get stuck anywhere in the following steps, please let me know – I will gladly extend this story so that it also makes sense to people who are not excessively nerdy.

    (1) Install Qemu – Qemu is, first of all, an emulator of all sorts of hardware. But since Android is enormously resource-hungry (well: by my standards), everything would be terribly sluggish if the Android code were not executed directly by the CPU in your machine – so I will use Qemu as a virtualiser and only very secondarily as an emulator. In any case, make sure that your qemu can do KVM. In return, you only need the amd64 flavour, not all the other architectures, and in particular not ARM. On bullseye, something like this should do:

    apt install qemu-system-gui qemu-system-amd64
    

    [out of stinginess, I myself went for qemu-system-x86 at this point; that works, too, and makes everything a bit more compact].

    (2) Get Android-x86 – I honestly confess that I have not worried much about the trustworthiness of the people behind the port of Android to x86 processors. I simply downloaded a suitable ISO image from their FOSSHUB page (a crapicity of 10 raises hopes); if you have installed the amd64 Qemu, you now want the “64-bit ISO file”.

    (3) Create a container for the Android file system – your Android has to store its files somewhere, and you certainly do not want to give it access to your real file system. So, create a “virtual” hard disk for the Qemu data. One gigabyte will not do, not even on i386. If you are not worried about disk space: better build one with four gigabytes right away (4G at the end of the command line).

    Also, pick some place where a lump of that size does not hurt too much. I am using ~/containers here (which you will then probably want to exclude from your backups):

    mkdir -p ~/containers
    qemu-img create -f qcow2 ~/containers/android.img 2G
    

    Display Problems

    Now there is the problem that your future Android has to send its screen output somewhere. Qemu can draw into an ordinary X window, but that is – for reasons I have not investigated – terribly sluggish. What worked well for me: VNC. If you cannot get along with that, try QDISPLAY="-display gtk" below (which may just be dead slow).

    (4) Start the Android installer – this takes a few options so that the thing gets onto the network and finds the two files it needs (the virtual disk and the Android installer):

    QDISPLAY="-display vnc=localhost:0"
    qemu-system-amd64 $QDISPLAY -enable-kvm -m 2000 \
      -net nic -net user -drive file=$HOME/containers/android.img,format=qcow2 \
      -boot d -cdrom /media/downloads/android-x86-9.0-r2.iso
    

    You will most certainly have to adapt the path in the -cdrom option so that it points to the ISO you just downloaded. Now let a VNC client loose on localhost:5900; these days, I recommend remmina (from the Debian package of the same name).[1]

    (5) Configure the Android container – select Installation to Hard disk, then Create/Modify Devices. You end up in a good old text-based partitioner. As the disk label, you do not want GPT (because that gives trouble with the boot loader GRUB later). The storage you are partitioning there is what you created in step 3. Put the whole “disk” into one partition, say Write (don't worry, with the options above you cannot destroy any data of yours) and then Quit.

    You then get back into the Android installer. After an Ok you can select the file system – take ext4.

    The installer then asks whether you want a GRUB – yes, you do, because otherwise your Android will later only come up with a lot of coaxing.

    You probably do not want the System Directory read/write (unless you seriously intend to play with the Android). That saves quite a bit of space.

    (6) Let Android onto the network – at this point the installer should offer to start Android-x86. Do that. Select a language – I left it at “English (United States)”.

    It may happen (it did for me) that the thing crashes after the language dialogue and is back at the installer's GRUB prompt. If that happens, terminate the qemu (i.e., Control-C in its window) and look below under “starting the VM” for the command line that brings up Qemu without the installer. We are dealing with commercial software here, after all, so rebooting until it is healthy is a perfectly legitimate option.
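    That “starting the VM” command line lies beyond this excerpt; presumably it is simply the invocation from step (4) minus the installer CD, roughly:

    # a guess at the VM-start command: step (4) without -boot d -cdrom
    QDISPLAY="-display vnc=localhost:0"
    qemu-system-amd64 $QDISPLAY -enable-kvm -m 2000 \
      -net nic -net user -drive file=$HOME/containers/android.img,format=qcow2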

    Anyway: after the language selection the thing wants to get onto the network, and I am afraid there is no point in forbidding it that. So, scan for networks. You should see exactly one, VirtWifi or some such. Select it, sigh, and meanwhile start a tcpdump -n on your real machine to see what all your new Android is chatting with (cf. Die Wunden lecken).

    The “Checking for Updates” burnt 100% CPU for minutes on end for me (you really do not want to know what it might have to compute there). Since I find the “I am currently doing something” feedback in the emulated Android not that great in general, you might as well spend the time whipping the CPU load display on your desktop into shape (my tip: wmcore).

    Then Android asks whether it can pull your data from somewhere. Sure: Google would love that. Fortunately, there is a small “Don't Copy” button. The Skip button in the next dialogue, the Google sign-in, is just as small, and Google nags once more if you choose it. Choose it anyway. Date and Time, for a change, can be waved through without trouble; then comes a dialogue about “Google Services”, all of which you have to switch off manually.

    That, apparently, is the user-friendliness (“user experience”) whose absence in the free software world I keep hearing so much about. There is, I believe, no way around accepting that Google may put stuff onto the VM at any time it pleases. But then, that is what it is a VM for.

    I find the following “Protect your Tablet” dialogue interesting, because the user guidance that a moment ago wanted to talk me into trusting Google now sows distrust of other humans, complete with a second extra admonition dialogue if I am not in the mood for device PINs. Honestly: when I am dealing with Google, I am not going to worry about human thieves…

    The concluding question about the home app I do not understand either. Just pick something. With that, you are in the Android business.

    Apps without an App Store

    (7) Tidy up the home screen – if you want to clean up the “home screen” right away: long-press an icon and drag it. A “Remove” field appears onto which you can drag the icon. Best do that with everything except Chrome; we will need that one in a moment. The obnoxious Google bar, I believe, cannot be removed by these means. Then again, why bother – as you have just acknowledged, the container belongs to Google anyway.

    (8) Find the Bahn app – as far as I know, the Bahn does not publish APKs (i.e., packages) of its app. So you will have to …

  • Running a Current Zoom Package on 32-bit Debian

    [There's a TL;DR at the end of this rant in case you're just desperate to get your Zoom client to work again]

    There are many reasons why proprietary, non-interoperable services are a bane and why I generally say no to them, be they Twitter, Discourse, or Google Docs. However, I regrettably cannot say no to Zoom, even though there are perfectly Free alternatives like, say, mumble plus perhaps VNC if you need screen sharing. Still, just about everyone simply expects you to cope with this heap of proprietary lock-in. Granted, at least it doesn't require downloading ridiculous amounts of Javascript code into your web browser and running it each time you use it, like some other telecon systems I could mention (including Free ones).

    Upgrade now! But… to what?

    And one has to grant the Zoom client this much: its audio interface used to be a lot better than what today's major browser vendors have to offer (minor browsers can't run the telecon crapware anyway).

    One strong reason to say no even to Zoom, however, is the feeling of helplessness when the software depends on a central server and that server suddenly locks you out. That happened to me with Zoom last week, when the damn thing started to say „you need to upgrade to connect to this conference“ or something to that effect.

    After I had been running the probably multiply exploitable, ugly binary package for perhaps two years without ever upgrading (Zoom doesn't operate a proper repository, so there's no apt-upgrading it), that was not unreasonable per se. Except that even manually pulling a .deb from Zoom's Javascript-infested web pages did not help. Umm, what?

    It turns out that the Zoom folks do not bother any more to update their 32-bit package and simply distribute the 5.4.whatever that they're locking out. No amount of installing that changed Zoom's refusal to let me into the conference, and working with the telecon host to see whether any setting in the management interface would let me in again went nowhere. It is this feeling and fact of infant-like helplessness that I so detest about being forced into proprietary technologies.

    How great would it be if Zoom were proper Free software and I could build this myself? How even greater if it were built on open standards and I could just switch to an alternative client? Or cobble together my own?

    But no, I had to eat dirt. I fetched the 64-bit version and dpkg -i-ed it. I had already run:

    dpkg --add-architecture amd64
    

    on that box a long time ago, so I figured Debian's dependency resolution magic should cover the rest. How wrong I was.
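
    In case you are following along: after adding a foreign architecture, apt has to fetch that architecture's package lists before it can resolve anything for it, so an

    apt update

    is due at this point.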

    Typing zoom after dpkg -i zoom-amd64.deb followed by apt install -f to satisfy dependencies brought back the command prompt right away – but no zoom window. The program just silently crashed. What?

    In such a situation, the first thing I do is run strace to see what syscalls the program makes before dying. However, the output made no sense at all, starting with chdir(0x01). This would have to crash immediately (there certainly is no path name at the memory address 1), but the strace of Zoom instead went on for a few pages. Huh?

    After having been stumped for a few minutes, it dawned on me that stracing an amd64 binary will plausibly require an amd64 strace, so I typed:

    apt install strace:amd64
    

    and tried again. This time, strace's output made a lot more sense, and right before dying, it said:

    stat("/usr/lib", {st_mode=S_IFDIR|0755, st_size=32768, ...}) = 0
    writev(2, [{iov_base="/opt/zoom/zoom", iov_len=14},
     {iov_base=": ", iov_len=2},
     {iov_base="error while loading shared libraries", iov_len=36},
     {iov_base=": ", iov_len=2}, {iov_base="libpango-1.0.so.0", iov_len=17},
     {iov_base=": ", iov_len=2}, {iov_base="cannot open shared object file", iov_len=30},
     {iov_base=": ", iov_len=2}, {iov_base="No such file or directory", iov_len=25},
     {iov_base="\n", iov_len=1}], 10) = 131
    

    In other words: It tried to load the pango library (which draws text strings in Gtk and elsewhere) but failed. What? I installed this from a Debian package and it has not noticed the dependency?

    Well, the Zoom folks clearly got it wrong and fooled the machine into accepting the 32-bit libraries (which of course are already on the box) for 64-bit dependencies, which simply cannot work. I am not going to do research for a commercial entity and hence just gritted my teeth. Repeatedly letting the thing crash, I eventually figured out that I need to manually say:

    apt install libxcb-keysyms1:amd64 libglib2.0-0:amd64\
      libxcb-shape0:amd64 libxcb-shm0:amd64 libxcb-xfixes0:amd64\
      libxcb-randr0:amd64 libxcb-image0:amd64 libfontconfig1:amd64\
      libgl1-mesa-glx:amd64 libegl1-mesa:amd64 libxi6:amd64\
      libsm6:amd64 libxrender1:amd64 libpulse0:amd64 libxcomposite1:amd64\
      libxslt1.1:amd64 libsqlite3-0:amd64 libxcb-xtest0:amd64\
      libdbus-1-3:amd64 libxtst6:amd64 libxcb-xinerama0:amd64\
      libxkbcommon-x11-0:amd64 desktop-file-utils:amd64\
      libgbm1:amd64 libdrm2:amd64 libfreetype6 libatk-bridge2.0-0:amd64\
      libxrandr2:amd64 libpango-1.0-0:amd64 libcairo2:amd64 libcups2:amd64
    

    to have the damn thing at least not crap out during startup. What, if I may ask, does it need cups for?

    Alas, that's still not good enough. While Zoom at least did not immediately terminate any more, it still did not properly connect either. This time, strace -f-ing ended with:

    5783  poll([{fd=3, events=POLLIN}, {fd=13, events=POLLIN}, {fd=14, events=POLLIN}], 3, 0) = 0 (Timeout)
    5783  poll([{fd=3, events=POLLIN}, {fd=13, events=POLLIN}, {fd=14, events=POLLIN}], 3, 14995 <unfinished ...>
    5789  <... futex resumed>)              = -1 ETIMEDOUT (Connection timed out)
    5789  futex(0x7fa62a77b4d0, FUTEX_WAKE_PRIVATE, 1) = 0
    and then a lot of
    5789  +++ exited with 0 +++
    ...
    5783  +++ exited with 0 +++
    

    Oh great. A successful exit even though the program failed. The joys of commercial software development.

    But be that as it may: it is failing because whatever should be feeding file descriptor 3 apparently is not fast enough. The question is: what? Well, let's see what this file descriptor 3 is. In my vi, I'm grepping through the strace log for a call in process 5783 returning three, like this:

    ?5783.*= 3$
    

    and I find:

    5783  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
    5783  connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, 20) = 0
    
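
    (If you would rather hunt outside of vi: saving the trace to a file and grepping works just as well; the file name here is made up:)

    strace -f -o zoom.trace zoom   # dump the trace into a file
    grep ' = 3$' zoom.trace        # find the calls returning file descriptor 3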

    Oh dang. The thing is waiting for the X server when it dies? Why would the X server time out? Spoiler: That I have not found out. But quite close to this I saw that Zoom also opens the file ~/.zoom/zoom_stdout_stderr.log and dumps the messages I was missing on stderr there. In fact, I could have gathered the missing libraries from that file rather than strace had I known about it.

    What did I find in there? Well:

    ZoomLauncher started.
    Zoom path is: /opt/zoom
    cmd line:
    Start subprocess: /opt/zoom/zoom sucessfully,  process pid: 5916
    sh: 1: pactl: not found
    

    This looks almost as if it needs pulseaudio? But what about the timeout on the X socket? I don't know, but I can report that installing pulseaudio-utils did fix the pactl failure (if you want to call it that), but it still did not make the thing run. At least according to strace:

    7193  execve("/bin/sh", ["sh", "-c", "pactl --version"], 0x7ffd6d009b08 /* 38 vars */ <unfinished ...>
    ...
    7193  <... execve resumed>)             = 0
    

    it's not because pactl would fail, though frankly it seems a bit odd that Zoom calls an external binary in the first place and then goes through the shell on top – what's wrong with execve and friends? Zoom, however, still exits on the X timeout:

    7223  poll([{fd=3, events=POLLIN}, {fd=13, events=POLLIN}, {fd=14, events=POLLIN}], 3, 14994 <unfinished ...>
    [...]
    7223  <... poll resumed>)               = 0 (Timeout)
    

    At this point, I seemed to be at a dead end: do I really want to debug whatever Zoom's Qt basis had to work out with X11 that would make X11 fail that dramatically?

    On a wild guess, I suspected some stale setting now that I had noticed there is a .zoom directory. I hence moved that away. Lo and behold: suddenly the messages so far hidden in .zoom arrived on stderr. And it now said “No PulseAudio daemon running, or not running as session daemon“. Aw, bother. That the Zoom client properly dealt with plain ALSA was, frankly, one of the reasons I sort of gave in to Zoom. That's now over, too.

    I sprinkled the box with a bit of holy water (something to the effect of pulseaudio --start), and Zoom finally came up and connected to their nasty, proprietary server that had locked me out with the 32-bit client.

    Too long; didn't read

    So… If you're in the same situation as I was and Zoom's servers lock you out: installing the extra libraries and pulseaudio and then moving the .zoom subdirectory out of the way – I'd probably not remove it altogether immediately, as it might contain credentials or a Zoom bitcoin wallet or whatever – might very well get you back into the next boring and superfluous telecon.
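
    Condensed into commands, as a rough sketch only – the .deb name is whatever Zoom's download page hands you, and your list of missing libraries may differ from mine above:

    dpkg --add-architecture amd64 && apt update
    dpkg -i zoom-amd64.deb && apt install -f
    apt install pulseaudio pulseaudio-utils  # plus the :amd64 libraries listed above
    mv ~/.zoom ~/.zoom.bak                   # stale state; keep the backup
    pulseaudio --start
    zoom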

  • Making Linux React to Power Gain and Loss

    Photo of a mains switch built into a power socket

    This is what this post is about: having a single switch for monitor, amplifier, and computer.

    I use an oldish notebook with a retired monitor and an amplifier I picked up from kerbside junk to watch TV („consume visual media“, if you prefer), and I want all that hardware to be switched on and off using a single power switch (see, um… Figure 1).

    Given that the notebook's battery still is good for many minutes, it's a perfectly reasonable stand-in for a UPS. Hence, my problem is quite like the one in the ancient days when big-iron servers had UPSes with just enough juice to let them orderly shut down when the power grid was failing. This used to involve daemons watching a serial line coming in from the UPS. Today, with ACPI in almost every x86 box and batteries in many of them, it's quite a bit simpler.

    This post shows how to automatically power (up and) down with acpid. If you run systemd, you probably will have to do a few extra tweaks to keep it from interfering. Please write in if you figure them out or if things just work.

    Make the Box Wake Up On Power

    The first half is to configure the machine to wake up when mains power returns. Notebooks typically don't do that out of the box, but most ACPI firmwares can be configured that way. On my box, a Thinkpad X230 controlled by a legacy BIOS rather than UEFI, it is a setting in the BIOS setup pages[1]. If you boot through UEFI, you may be able to do this configuration from within Linux (please write in if you can provide details on that).

    Having said that, let me, perhaps only loosely relatedly, mention /proc/acpi/wakeup, which may play a role in this for you (although it does not on the X230). If you cat this file, you will see something like:

    LID       S4    *enabled   platform:PNP0C0D:00
    SLPB      S3    *enabled   platform:PNP0C0E:00
    IGBE      S4    *enabled   pci:0000:00:19.0
    EXP3      S4    *disabled  pci:0000:00:1c.2
    XHCI      S3    *enabled   pci:0000:00:14.0
    EHC1      S3    *enabled   pci:0000:00:1d.0
    EHC2      S3    *enabled   pci:0000:00:1a.0
    HDEF      S4    *disabled  pci:0000:00:1b.0
    

    Whatever is enabled here will wake the machine up, sometimes depending on whether it is hibernating or just suspended. There are various events that could cause a wakeup, such as when the lid is opened (in the ACPI lingo used here, LID), when a Wake-on-LAN packet arrives (IGBE), when the sleep/power button is pressed (SLPB) or when someone puts in a signal via USB (XHCI, EHC1, EHC2; typically, that would be a keyboard)[2]. To change this, you echo the respective string into the file, which toggles the enabledness:

    $ echo LID | sudo tee /proc/acpi/wakeup
    LID
    $ cat /proc/acpi/wakeup | grep LID
    LID       S4    *disabled  platform:PNP0C0D:00
    

    If there's no obvious BIOS setting for waking up the machine on power, look for something like PWR in /proc/acpi/wakeup. Incidentally, disabling wakeup sources here may actually conserve battery power when hibernating.
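
    If you do find such an entry, the same echo trick as above should switch it on; “PWR” in this sketch is a made-up name – use whatever your box actually lists:

    echo PWR | sudo tee /proc/acpi/wakeup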

    Make the Box Hibernate on Mains Loss

    The second half is that the machine should go into hibernation when I flip the central power switch. A straightforward way to get there is to talk to the acpid. It seems it is still standard on PC-style hardware even when there is systemd.

    So, let us configure it to call an appropriate script when it switches to battery mode (i.e., the power has gone). You can do that sufficiently well by writing:

    # /etc/acpi/events/battery
    # Called when AC power goes away and we switch to battery
    
    event=battery.*
    action=/etc/acpi/to-battery.sh
    

    to /etc/acpi/events/battery. The Debian-distributed acpid already has that file, but it calls the script power.sh, which, as delivered, does something entirely different; you could modify power.sh to your liking, but it's cleaner to use a different, custom script, for instance, because it is less hassle on dist-upgrades. Disclaimer: This will fire too often, namely both on power up and down. However, at least on my hardware that doesn't hurt, and it doesn't seem acpid generates different events for battery in/out.

    Then create the script /etc/acpi/to-battery.sh. I've written this there:

    #!/bin/sh
    
    sleep 2
    if [ `cat /sys/class/power_supply/AC/online` -eq 1 ]; then
      exit
    fi
    
    # x230 specials; you probably won't need them
    buslist="pci i2c"
    for bus in $buslist; do
      for i in /sys/bus/$bus/devices/*/power/control; do
          echo on > $i
      done
    done
    
    logger "powerbutton-acpi-support enter"
    sync
    sync
    echo platform > /sys/power/disk
    echo disk > /sys/power/state
    logger "powerbutton-acpi-support leave"
    
    (sleep 12; ntpdate pool.ntp.org) &
    # this is just an example of an extra hack for resetting a TV
    # card that would be broken after the wakeup.
    (sleep 2; logger reloading tv; /usr/local/bin/uhubctl -l 1-1 -a cycle) &
    

    This thing first waits two seconds and then ensures AC is really gone before doing anything else; this is because on my box I occasionally received spurious power loss notifications, and hibernating the box just when something interesting was on TV has interrupted the rare moments of enjoyable programming a few times too often. Besides, this will catch cases where the battery event is generated by power coming back.

    After that, I'm running a few specials where I enable power management on the PCI and I²C busses of the machine. That has been broken for some reason or other on at least one kernel version on this particular box. I've left it in the script above as an inspiration for how you could intervene if something doesn't quite work and needs some fiddling.

    It then proceeds to sync the disk, just in case something goes wrong on suspend or resume and eventually does a low-level hibernation. You could probably use pm-hibernate or some systemd facility just as well, but I personally have found the direct operation of /sys/power to be at the same time the least hassle and the least fragile option (at least if you're prepared to write a few lines of script like the bus loop in my example).
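
    For reference, the higher-level equivalents of those two /sys/power lines would be one of the following, assuming the respective tool is installed (I have not tested either in this script):

    sudo pm-hibernate         # pm-utils
    sudo systemctl hibernate  # systemd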

    The last two commands – an NTP update and a hack to reset a USB device that is confused after a wakeup – are executed as part of the wakeup (but in background shells so the box is quickly responsive again). Adapt to your needs.

    Enjoy – and conserve energy by centrally pulling power from all the greedy little wall plug transformers.

    [1]On the X230, to change it I had to press Enter immediately after power-up, then F1, and then navigate to “Power On with AC Attach“ in the Config pane – but regrettably, there's nothing even resembling a standard there, and given this is tech that has supposedly been obsolete for, what, 15 years, I don't think there will ever be one.
    [2]In case you're wondering what HDEF is: Well, it's audio, according to other things ACPI. What I don't know is how to make the audio hardware send a wakeup signal. I've tried plugging in a headphone, and that didn't work. If you know more… well, again, totally feel free to write in.
  • How to Block a USB Port on Smart Hubs in Linux

    Lots of computer components (a notebook computer with its cover removed)

    Somewhere beneath the fan on the right edge of this image there is breakage. This post is about how to limit the damage in software until I find the leisure to dig deeper into this pile of hitech.

    My machine (a Lenovo X240) has a smart card reader built in, attached to its internal USB. I don't need that device, but until a while ago it did not really hurt either. Yes, it may draw a bit of power, but I'd be surprised if that were more than a few milliwatts or, equivalently, one level of screen backlight brightness; at that level, not even I will bother.

    However, two weeks ago the thing started to become flaky, presumably because the connecting cable is starting to rot. The symptom is that the USB stack regularly re-registers the device, spewing a lot of characters into the syslog, like this:

    Aug 20 20:31:51 kernel: usb 1-1.5: USB disconnect, device number 72
    Aug 20 20:31:51 kernel: usb 1-1.5: new full-speed USB device number 73 using ehci-pci
    Aug 20 20:31:52 kernel: usb 1-1.5: device not accepting address 73, error -32
    Aug 20 20:31:52 kernel: usb 1-1.5: new full-speed USB device number 74 using ehci-pci
    Aug 20 20:31:52 kernel: usb 1-1.5: New USB device found, idVendor=058f, idProduct=9540, bcdDevice= 1.20
    Aug 20 20:31:52 kernel: usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
    Aug 20 20:31:52 kernel: usb 1-1.5: Product: EMV Smartcard Reader
    Aug 20 20:31:52 kernel: usb 1-1.5: Manufacturer: Generic
    Aug 20 20:31:53 kernel: usb 1-1.5: USB disconnect, device number 74
    Aug 20 20:31:53 kernel: usb 1-1.5: new full-speed USB device number 75 using ehci-pci
    [as before]
    Aug 20 20:32:01 kernel: usb 1-1.5: new full-speed USB device number 76 using ehci-pci
    Aug 20 20:32:01 kernel: usb 1-1.5: New USB device found, idVendor=058f, idProduct=9540, bcdDevice= 1.20
    Aug 20 20:32:01 kernel: usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
    [as before]
    Aug 20 20:32:02 kernel: usb 1-1.5: USB disconnect, device number 76
    

    And that's coming back sometimes after a few seconds, sometimes after a few tens of minutes. Noise in the syslog is never a good thing (even when you don't scroll syslog on the desktop), as it will one day obscure something one really needs to see, and given that device registrations involve quite a bit of computation, this also is likely to become relevant power-wise. In short: this has to stop.

    One could just remove the device physically or at least unplug it. Unfortunately, in this case that is major surgery, which in particular would involve the removal of the CPU heat sink. For that I really want to replace the thermal grease, and I have not been to a shop that sells that kind of thing for a while. So: software to the rescue.

    With suitable hubs – the X240's internal hub with the smart card reader is one of them – the tiny utility uhubctl lets one cut power to individual ports. Uhubctl regrettably is not packaged yet; you hence have to build it yourself. I'd do it like this:

    sudo apt install git build-essential libusb-dev
    git clone https://github.com/mvp/uhubctl
    cd uhubctl
    prefix=/usr/local/ make
    sudo env prefix=/usr/local make install
    

    After that, you have a program /usr/local/sbin/uhubctl that you can run (as root or through sudo, as it needs elevated permissions) and that then tells you which of the USB hubs on your system support power switching, and it will also tell you about devices connected. In my case, that looks like this:

    $ sudo /usr/local/sbin/uhubctl
    Current status for hub 1-1 [8087:8000, USB 2.00, 8 ports, ppps]
      Port 1: 0100 power
      [...]
      Port 5: 0107 power suspend enable connect [058f:9540 Generic EMV Smartcard Reader]
      [...]
    

    This not only tells me the thing can switch off power, it also tells me the flaky device sits on port 5 on the hub 1-1 (careful inspection of the log lines above will reconfirm this finding). To disable it (that is, power it down), I can run:

    $ sudo /usr/local/sbin/uhubctl -a 0 -l 1-1 -p 5
    

    (read uhubctl --help if you don't take my word for it).

    Unfortunately, we are not done yet. The trouble is that the device will wake up the next time anyone touches anything in the wider vicinity of that port, which happens, for instance, when you run uhubctl itself. To keep the system from trying to wake the device up, you also need to instruct the kernel to keep its hands off. For our port 5 on the hub 1-1, that's:

    $ echo disabled > /sys/bus/usb/devices/1-1.5/power/wakeup
    

    or rather, because you cannot write to that file as a normal user and I/O redirection is done by your shell and hence wouldn't be influenced by sudo:

    $ echo disabled | sudo tee /sys/bus/usb/devices/1-1.5/power/wakeup
    

    That, indeed, shuts the device up.

    Until the next suspend/resume cycle, that is, because these settings do not survive one. To solve that, arrange for a script to be called after resume. That's simple if you use the excellent pm-utils. In that case, simply drop the following script into /etc/pm/sleep.d/30killreader (or so) and chmod +x the file:

    #!/bin/sh
    case "$1" in
      resume|thaw)
        echo disabled > /sys/bus/usb/devices/1-1.5/power/wakeup
        /usr/local/sbin/uhubctl -a 0 -l 1-1 -p 5
        ;;
    esac
    exit 0
    

    If you are curious what is going on here, see /usr/share/doc/pm-utils/HOWTO.hooks.gz.

    However, these days it is rather unlikely that you are still leaving suspend and hibernate to pm-utils; instead, on your box this will probably be handled by systemd-logind. You could run pm-utils next to that, I suppose, if you tactfully configured the host of items with baroque names like HandleLidSwitchExternalPower in logind.conf, but, frankly, I wouldn't try that. Systemd's reputation for wanting to manage it all is not altogether undeserved.

    I tried to smuggle my own code into logind's wakeup procedures years ago in systemd's infancy and found it hard if not impossible. I'm sure it is simpler now. If you know a good way to make logind run a script when resuming: Please let me know. I promise to amend this post for the benefit of people running systemd (which, on some suitable boxes, does include me).
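
    For what it's worth, the systemd-suspend.service(8) man page documents a hook directory that looks like it should do the trick. Here is a sketch mirroring the pm-utils hook above; I have not tested it on my box:

    #!/bin/sh
    # /lib/systemd/system-sleep/30killreader (untested sketch)
    # systemd calls these hooks with $1 = pre|post and $2 = suspend|hibernate|...
    case "$1" in
      post)
        echo disabled > /sys/bus/usb/devices/1-1.5/power/wakeup
        /usr/local/sbin/uhubctl -a 0 -l 1-1 -p 5
        ;;
    esac
    exit 0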

  • PSA: netsurf 3 does not accept cookies from localhost

    As I have already pointed out in April, I consider simple and compact web browsers a matter of freedom (well, Freedom as in speech, actually), and although there's been a bit of talk about ladybird lately, my favourite in this category still is netsurf, which apparently to this date is lean enough to be runnable on vintage 1990 Atari TT machines. I'll freely admit I have not tried it, but the code is there.

    Yesterday, however, netsurf drove me crazy for a while: I was developing a web site, making sure it works with netsurf. This website has a cookie-based persistent login feature, and that didn't work. I sent my Set-Cookie headers all right – ngrep is your friend if you want to be sure, somewhat like this:

    sudo ngrep -i -d lo cookie port 8888
    

    Ngrep also clearly showed that netsurf really did not send any Cookie headers, so the problem wasn't on the cookie header parsing side of my program, either.

    But why did the cookies disappear? Cookie policy? Ha: netsurf does accept a cookie from Google, and crunching this would be the first thing any reasonable policy would do. Did I perhaps fail to properly adhere to the standards (which is another thing netsurf tends to uncover)? Hm: looking up the cookie syntax spec gave me some confidence that I was doing the right thing. Is my Max-Age ok? Sure, it is.

    The answer to this riddle: netsurf does not store cookies if it cannot sort them into a hierarchy of host names, and it never can do that for host names without dots (as in localhost, for instance). Given the ill-thought-out Domain attribute one can set for cookies (see the spec linked above if you want to shudder), I even have a solid amount of sympathy for that behaviour.

    But given that that is something that will probably bite a lot of people caring about freedom enough to bother with netsurf, I am still a bit surprised that my frantic querying of search engines on that matter did not bring up the slightly unconventional cookie handling of netsurf. Let us hope this post's title will change that. Again, netsurf 3 will not store cookies for not only localhost but any host name without dots in it. Which is a bit inconvenient for development, and hence despite my sympathy I am considering a bug report.

    Meanwhile, I've worked around the problem by adding:

    127.0.0.1 victor.local.de
    

    to my /etc/hosts (the name really doesn't matter as long as it will never clash with an actual site you want to talk to and it contains one or more dots) and accessing the site I'm developing as http://victor.local.de. Presto: my cookie comes back from netsurf all right.
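
    (To make sure such an alias actually resolves before blaming netsurf again, getent is your friend:)

    getent hosts victor.local.de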

    A Debugging Session

    So, how did I figure this riddle out? The great thing about Debian and halfway compact software like netsurf is that it makes it reasonably simple to figure out such (mis-) features. Since I firmly believe that the use of debuggers is a very basic skill everyone touching a computer should have, let me give a brief introduction here.

    First, you need to get the package's source. Make sure it matches the version of the program that you actually run; to do that, copy the deb line in /etc/apt/sources.list for the repository the package comes from (note that this could be the security repo if you got updates from there). In the copied line, replace deb with deb-src. In my case, that would be:

    deb-src https://deb.debian.org/debian bullseye main
    

    On a freshly installed Debian, it's likely you already have a line like this; consider commenting out the deb-src lines when not working with source code, as that will make your apt operations a bit faster.

    After an apt update, I can now pull the source. To keep your file system tidy, I put all such sources into children of a given directory, perhaps /usr/src if you're old-school, or ~/src if not:

    cd
    mkdir -p src/netsurf
    cd src/netsurf
    apt-get source netsurf-gtk
    

    I'm creating the intermediate netsurf directory because apt-get source creates four items in the directory, and in case you're actually building a package (which you could, based on this), more entries will follow; keeping all that mess outside of src helps a lot. Note that apt-get source does not need any special privileges. You really should run it as yourself.

    By the way, this is the first part where monsters like webkit make this kind of thing really strenuous: the libwebkit sources (which still fall well short of a full browser) pull 26 megabytes of archive expanding to a whopping 300 megabytes of source-ish goo.

    To go on, enter the directory that apt-get source created; in my case, that was netsurf-3.10. You can now look around, and something like:

    find . -name "*.c" | xargs grep "set-cookie"
    

    quickly brought me to a file called netsurf/content/urldb.c (yeah, you can use software like rgrep for „grep an entire tree“; but then the find/xargs combo is useful for many other tasks, too).

    Since I still suspected a problem when netsurf parses my set-cookie header, the function urldb_parse_cookie in there caught my eye. It's not pretty that that function is such an endless beast of hand-crafted C (rather than a few lines of lex[1]), but it's relatively readable C, and they are clearly trying to accommodate some of the horrible practices out there (which is probably the reason they're not using lex), so just looking at the code cast increasing doubt on my hypothesis of some minor standards breach on my end.

    In this way, idly browsing the source code went nowhere, and I decided I needed to see the thing in action. In order to not get lost in compiled machine code while doing that, one needs debug symbols, i.e., information that tells a debugger what compiled stuff resulted from what source code. Modern Debians have packages with these symbols in an extra repository; you can guess the naming scheme from the sources.list string one has to use for bullseye:

    deb http://debug.mirrors.debian.org/debian-debug bullseye-debug main
    

    After another round of apt update, you can install the package netsurf-gtk-dbgsym (i.e., just append a -dbgsym to the name of the package that contains the program you want to debug). Once that's in, you can run the GNU debugger gdb:

    gdb netsurf
    

    which will drop you into a command line prompt (there's also a cool graphical front-end to gdb in Debian, ddd, but for little things like this I've found plain gdb to be less in my way). Oh, and be sure to do that in the directory with the extracted sources; only then can gdb show you the source lines (ok: you could configure it to find the sources elsewhere, but that's rarely worth the effort).

    Given we want to see what happens in the function urldb_parse_cookie, we tell gdb to come back to us when the program enters that function, and then to start the program:

    (gdb) break urldb_parse_cookie
    Breakpoint 1 at 0x1a1c80: file content/urldb.c, line 1842.
    (gdb) run
    Starting program: /usr/bin/netsurf
    

    With that, netsurf's UI comes up and I can go to my cookie-setting page. When I try to set the cookie, gdb indeed stops netsurf and asks me what to do next:

    Thread 1 "netsurf" hit Breakpoint 1, urldb_parse_cookie (url=0x56bcbcb0,
        cookie=0xffffbf54) at content/urldb.c:1842
    1842  {
    (gdb) n
    1853    assert(url && cookie && *cookie);
    

    n (next) lets me execute the next source line (which I did here). Other basic commands include print (to see values), list (to see code), s (to step into functions, which n will just execute as one instruction), and cont (which resumes execution).

    In this particular debugging session, everything went smoothly, except I needed to skip over a loop that was boring to watch stepping through code. This is exactly what gdb's until command is for: typing it at the end of the loop will fast forward over the loop execution and then come back once the loop is finished (at which point you can see what its net result is).

    But if the URL parsing went just fine: Why doesn't netsurf send back my cookie?

    Well, tracing on after the function returned eventually led to this:

    3889      suffix = nspsl_getpublicsuffix(dot);
    (gdb)
    3890      if (suffix == NULL) {
    

    and a print(suffix) confirmed: suffix for localhost is NULL. Looking at the source code (you remember the list command, and I usually keep the source open in an editor window, too) confirms that this makes netsurf return before storing the freshly parsed cookie, and a cookie not stored is a cookie not sent back to the originating site. Ha!

    You do not want to contemplate what such a session would look like with a webkit browser or, worse, firefox or chromium, not to mention stuff you don't even have the source of …

  • Of Birthdays and /etc/papersize

    I am a quiet fan of the Debian package installation-birthday. Earlier today, it wrote me a mail:

    Date: Mon, 11 Apr 2022 11:04:11 +0200
    From: Anacron <root@hostname-withheld>
    To: root@hostname-withheld
    Subject: Anacron job 'cron.daily' on hostname-withheld
    
    /etc/cron.daily/installation-birthday:
    
                      0   0
                      |   |
                  ____|___|____
               0  |~ ~ ~ ~ ~ ~|   0
               |  |           |   |
            ___|__|___________|___|__
            |/\/\/\/\/\/\/\/\/\/\/\/|
        0   |       H a p p y       |   0
        |   |/\/\/\/\/\/\/\/\/\/\/\/|   |
       _|___|_______________________|___|__
      |/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/|
      |                                   |
      |         B i r t h d a y! ! !      |
      | ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ |
      |___________________________________|
    
    
    Congratulations, your Debian system "hostname-withheld" was installed
    15 year(s) ago today!
    
    
    Best wishes,
    
    Your local system administrator
    

    Oh wow. Such a round birthday really is a reason to celebrate.

    It is also a reason to think, though: something cannot be right here. Before my main work machine (whose file system dates back to about 1996) became a Debian system, I knitted and built it myself. I most certainly did not put myself through that any more in 2007, since the number of lines of code built into an average desktop Linux, and of components to be fiddled together somehow, had passed well beyond my personal pain threshold a few years before that. So what did happen in 2007?

    Well: what are sources for? In the case of installation-birthday, a quick look into the cron file mentioned in the mail leads to a Python script that computes the installation date as:

    for candidate in self.gen_installation_datetimes():
      # Use the oldest mtime we can find
      if dt is None or candidate < dt:
        self.log.debug("Preferring %s over %s", candidate, dt)
        dt = candidate
    

    gen_installation_datetimes in turn walks through the timestamps of the root file system, of /var/log/installer (does not exist on my box), /var/log/bootstrap.log (neither), /var/lib/vim (that is where the date comes from, and it thus probably reflects some date in the package's development, presumably the introduction of formal addons), /root, and /etc/machine-id (that file is from 2014, and a quick bit of research shows: that is when I first tried to make friends with systemd).

    Romantic as it may be to estimate my system's age from vi helper files: I would like this done differently. But how do I get at the migration date?

    My file system still holds many files going back to the hand-knit Linux (and more that come from even older machines, but those are confined to /home), so I cannot simply look for the oldest file. There is, however, an interesting cluster of timestamps in /etc/ on 7 July 2005, among them adduser.conf and papersize, which sound very much like something Debian might have created while paving over the home-grown system.

    Conversely, there is essentially nothing before that date that looks like Debian in any way; the timestamp of /etc/papersize it is, then.

    Now I wanted to know it more precisely, and since I was too lazy to dig out of the ls manpage or wherever how to get at the time of last modification without code, a Python one-liner had to do (yes, such things exist):

    $ python -c "import os, datetime; print(datetime.datetime.utcfromtimestamp(os.path.getmtime('papersize')))"
    2005-07-07 12:48:37
    

    So: on Thursday, 7 July 2005, at around three in the afternoon, I evidently switched my work machine over to Debian. Wow. I no longer remembered that I had kept up the do-it-yourself approach for that long.
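
    (Being less lazy after the fact: GNU stat would have produced the modification time without any Python, e.g.:)

    stat -c %y /etc/papersize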

    So that installation-birthday picks the right date in the future, I have just said:

    $ sudo touch -r /etc/papersize /var/log/installer
    

    With that, there is a /var/log/installer carrying the timestamp of /etc/papersize, which now gives installation-birthday the right signal:

    $ installation-birthday --verbosity 1
    I: Installation date: 2005-07-07
    

    I am already looking forward to July 7 – which this year, by the way, falls on a Thursday again, just as it did in 2005.

  • View with Netsurf

    A screenshot of a browser window

    An early version of this post rendered in netsurf.

    I believe about the worst threat to software freedom these days is web browsers. That is not only because they already are, for many people out there, a more relevant application platform than their primary operating system, and because almost everything that gets run in them is extremely non-Free software. I've been linking to a discussion of this problem from these pages since this blog's day one as part of my quip on “best viewed with javascript disabled“.

    No, they are also a threat because the “major” browser engines are so humongous that they are in effect locking out most platforms (which simply don't have enough power to run them). And they are a threat because the sheer size and complexity of their code bases make it essentially impossible for an individual to fix almost any relevant bug in them related to rendering, javascript execution, or network interactions.

    That is why I am so grateful to the authors and maintainers of both dillo (Debian: dillo) and netsurf (Debian: netsurf-gtk, mainly), small browsers with maintainable code bases. While dillo is really basic and is missing so much of CSS and modern HTML that on today's web even many non-adversarial sites become barely usable, netsurf is usually just fine for websites respecting user rights.

    Flex layouts and the article element: the good part of 20 years of web development after Web 1.0.

    I have to admit I nevertheless only use it in very specific contexts, mostly because luakit with its vi-like key bindings and Lua extensibility usually wins out in the end, even though I don't trust the webkit rendering engine for two cents[1]. And that is why I hadn't noticed that this blog has rendered a lot worse than it should have in netsurf. This is particularly shameful because that was mostly because I had taken liberties with web standards that I should not have taken. Apologies: Netsurf was right and I was wrong.

    I have improved that quite a bit this morning. Given I am using flex layouts quite liberally here, and these don't work in Debian stable's netsurf, the rendered pages do look quite a bit different in netsurf than on the “major” browsers. But the fallbacks are ok as far as I am concerned. Since flex layouts are among the few “innovations“ in the post-Web 1.0 ecosystem that are actually a good idea, I gladly accept these fallbacks. Let me stress again that it is a feature of a friendly web rather than a bug that pages look different in different user agents.

    Dillo, regrettably, is another matter because of the stupid^Wunderconsidered colour games I'm playing here. As things are right now, the light background below text like this one sits on an HTML5 article element, which dillo ignores. Hence, the text is black on dark green, which, well, may be barely readable but really is deeply sub-optimal. Since I consider the article element and its brethren real progress in terms of markup (the other positive “innovation” post Web-1.0), I will not change that markup just to make this render better in dillo. I may finally re-think the silly dark green background soon-ish, though.

    [1]If you feel like this, too, let's team up and massage luakit's front end to work with netsurf's rendering engine. Given the close entanglement of luakit with the webkitgtk API, this certainly will result in a very different program, and almost certainly there would be no way to re-use luakit extensions. Still, I could very well see such a thing become my main browser.
  • 'Failed to reset ACL' with elogind: Why?

    As I've blogged the other day, I like having my machine's syslog on the screen background so I notice when the machine is unwell and generally have some idea what it thinks it is doing. That also makes me spot milder distress signals like:

    logind-uaccess-command[30337]: Failed to reset ACL on /dev/bus/usb/002/061: Operation not supported
    

    I've ignored those for a long time since, for all I can see, logind-like software does nothing that on a normal machine sudo and a few judicious udev rules couldn't do just as well – and are doing on my box. The only reason there's elogind (a logind replacement that can live without systemd) on my box is that in Debian, kio – which 270 packages in bullseye depend upon – depends upon something like logind. The complaints in the syslog thus came from software I consider superfluous and would rather not have at all, which I felt was justification enough to look the other way.

    But then today curiosity sneaked in: What is going on there? Why would whatever elogind tries break on my box?

    Well, the usual technique of pasting relevant parts of the error message into some search engine leads to elogind PR #47 (caution: github will run analytics on your request). This mentions that the message results from a udev rule that tries to match hotplugged devices with users occupying a “seat”[1]. The rule calls some binary that would make sure that the user on the “seat” has full access to the device without clobbering system defaults (e.g., that members of the audio group can directly access the sound hardware) – and to keep the others out[2]. The Unix user/group system is not quite rich enough for this plan, and hence a thing called POSIX ACLs would be used for it, a much more complicated and fine-grained way of managing file system access rights.

    Well, the udev rules mentioned in the bug indeed live on my box, too, namely in /lib/udev/rules.d/73-seat-late.rules, which has the slightly esoteric:

    TAG=="uaccess", ENV{MAJOR}!="", RUN{program}+="/lib/elogind/elogind-uaccess-command %N $env{ID_SEAT}"
    

    I frankly have not researched what exactly adds the uaccess tag that this rule fires on, and when it does that, but clearly it does happen in Debian bullseye. Hence, this rule fires, and thus the failing elogind-uaccess-command is started.

    But why does it fail? Well, let's see what it is trying to do. The great thing about Debian is that as long as you have a (proper) deb-src line in your /etc/apt/sources.list, you can quickly fetch the source code of anything on your box:

    cd /usr/src  # well, that's really old-school.  These days, you'll
                 # probably have your sources somewhere else
    mkdir elogind # apt-get source produces a few files
    cd elogind    # -- keep them out of /usr/src proper
    apt-get source elogind
    cd <TAB>  # there's just one child directory
    

    To see where the source of the elogind-uaccess-command would be, I could have used a plain find, but in cases like these I'm usually lazy and just recursively grep for sufficiently specific message fragments, as in:

    find . -name "*.c" | xargs grep "reset ACL"
    

    This brings up src/uaccess-command/uaccess-command.c, where you'll find:

    k = devnode_acl(path, true, false, 0, false, 0);
    if (k < 0) {
             log_full_errno(errno == ENOENT ? LOG_DEBUG : LOG_ERR, k, "Failed to reset ACL on %s: %m", path);
             if (r >= 0)
                     r = k;
     }
    

    Diversion: I like the use of the C ternary operator to emit a debug or error message depending on whether or not things failed because the device file that should have its ACL adapted does not exist.

    So, what fails is a function called devnode_acl, which does not have a manpage but can be found in login/logind-acl.c. There, it calls a function acl_get_file, and that has a man page. Quickly skimming it would suggest the prime suspect for failures would be the file system, as that may simply not support POSIX ACLs (which, as I just learned, aren't really properly standardised). Well, does it?

    An apropos acl brings up the chacl command that would let me try acls out from the shell. And indeed:

    $ chacl -l /dev/bus/usb/001/003
    chacl: cannot get access ACL on '/dev/bus/usb/001/003': Operation not supported
    

    Ah. That in fact fails. To remind myself what file system we are talking about, I ran mount | grep "/dev " (the trailing blank in the search pattern is important), which corrected my memory from “it's a tmpfs” to “it's a devtmpfs”. It turns out the difference between the two does not matter for the problem at hand; either way, your average search engine, asked for “devtmpfs acl”, will bring up the vintage 2009 patch at https://lwn.net/Articles/345480/ (also from the abysses from which systemd came), and a quick skim of that patch made me notice:

    #ifdef CONFIG_TMPFS_POSIX_ACL
    (something)
    

    This macro comes from the kernel configuration. Now, I'm still building the kernel on my main machine myself, and looking at the .config in my checkout of the kernel sources confirms that I have been too cheap to enable POSIX ACLs on my tmpfses (for a machine with, in effect, just a single user who's only had contact with something like POSIX ACLs ages ago on an AFS, that may be understandable).
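
    If you want to check a kernel for this yourself, a sketch (paths differ between distribution kernels and self-built ones):

    # distribution kernels usually ship their config in /boot:
    grep TMPFS_POSIX_ACL /boot/config-$(uname -r)
    # for a self-built kernel, look at .config in the source tree; the
    # fix is setting CONFIG_TMPFS_POSIX_ACL=y and rebuilding.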

    Well, I've enabled it and re-built my kernel, and I'm confident that after the next reboot the elogind messages will be gone. And who knows, perhaps the thing may actually save me a custom udev rule or two in the future because it automagically grants me access to whatever I plug in.

    Then again: Given there's now an API for Javascript from the web to read USB devices (I'm not making this up) and at least so far I'm too lazy to patch that out of my browsers… perhaps giving me (and hence these browsers) that sort of low-level access is not such a good idea after all?

    [1]See Multiseat on Wikipedia if you have no idea what I'm talking about. If you've read that you can probably see why I consider logind silly for “normal” computers with either a single user or lots of users coming in through the network.
    [2]Mind you, that in itself is totally reasonable: it would suck if everyone on a machine could read the USB key you've just plugged into a terminal; except that it's a rare configuration these days to have multiple persons share a machine that anyone but an administrator could plug anything into.
  • Now on the Fediverse

    Mastodon logo

    AGPL (copyright)

    While I believe that RSS (or rather Atom a.k.a. RFC 4287) is a great standard for subscribing to media like blogs[1], I strongly suspect that virtually nobody pulls my RSS feed. I'm almost tempted to log accesses for a while to ascertain that. Then again, just based on how few people still run RSS aggregators (me, I'm using a quick self-written hack based on python3-feedparser) I am already quite confident the RSS mainly sits idle on my server.

    At least outside of my bubble, I guess what RSS was designed for has been superseded by the timelines of Facebook, Twitter, and their less shopworn ilk. As a DIY zealot, of course none of that is an option for me. What is an option in this field (and what certainly could do with a bit more public attention) is what these days is commonly called the Fediverse, that is, various sites, servers and client programs in the rough vicinity of microblogging, held together by W3C's ActivityPub protocol.

    What this technobabble means in practice: If you already are in the Fediverse, you can follow @Anselm@social.dev-wiki.de and get a toot whenever I post something here (note, however, that most posts will be in German).

    If you're not in the Fediverse yet, well, choose a community[2] – if I had to choose again, I'd probably take a larger community, as that increases one's initial audience: other communities will, for all I understand, only carry your public toots (i.e., messages) if someone in them has subscribed to someone from your community. Then get a client – I'm using tootle as a GUI and toot for the CLI – and add my Fediverse id.

    To accommodate tooting about new posts, I have made two changes to my pelican tooling: For one, post.py3 now writes a skeleton toot for the new post, like this:

    with open("next-toot", "w", encoding="utf-8") as f:
      f.write(f"{headline} – https://blog.tfiu.de/{slug}.html\n#zuengeln\n")
    

    And I have a new Makefile target:

    toot:
      (cat next-toot; echo "Post?"; read x)
      toot post < next-toot
    

    In that way, when I have an idea what the toot for the article should contain while I'm writing the post, I edit next-toot, and after I've run my make install, I'm doing make toot to notify the Fediverse.

    A side benefit: if you'd like to comment publicly and don't want to use the mail contact below, you can now do that through Mastodon and company.

    [1]That it is a great standard is already betrayed by the fact that its machine-readable specification is in Relax NG rather than XML schema.
    [2]This article is tagged DIY although I'm not running a Mastodon (or other ActivityPub server) instance myself because, well, I could do that. I don't, for now, because Mastodon is not packaged for Debian (and for all I can tell neither are alternative ActivityPub servers). Looking at Mastodon's source I can understand why. Also, I won't rule out that the whole Fediverse thing will be a fad for me (as was identi.ca around 2009), and if I bother to set up unpackaged infrastructure, I need to be dead sure it's worth it.
  • Fixing "No sandbox user" the Right Way

    I'm setting up an ancient machine – a Pentium M box with a mere 256 MB of RAM – with current Debian bullseye, and I'm impressed that that still works: this machine is almost 20 years old. Hats off to the Debian folks.

    But that's not really my story. Instead, this is about fixing what's behind the message:

    No sandbox user '_apt' on the system, can not drop privileges
    

    from apt. My first reaction – as yours probably was just now – was to feed that message to a search engine.

    Quite a few pages were returned, and all I looked at suggested to simply create the user using one of the many ways a Debian box has for that. That is not totally unreasonable, but it does not really address the underlying cause, and hence I thought I should do better.

    The immediately underlying cause is that for whatever deeper reason one of the maintainer scripts – shell scripts that Debian packages run after installing packages or before removing them – has not properly run; those are usually the place where packages create users and do similar housekeeping. Just creating the user may or may not be enough, depending on what else the maintainer script would have done.

    Hence, the better way to fix things is to re-run the maintainer script, as that would either run the full routine or at least give an error message that lets you figure out the deeper cause of the problem. Dpkg runs the maintainer script(s) automatically when you re-install the package in question.

    But what is that “package in question” that should have created the user? You could guess, and in this particular case your guess would quite likely be right, but a more generally applicable technique is to simply see what script should have created the user. That's not hard to do once you know that the maintainer scripts are kept (next to other package metadata) in /var/lib/dpkg/info/; so, with GNU grep's -r (recursive) option, you can run:

    grep -lr "_apt" /var/lib/dpkg/info/
    

    which gives the names of all files containing _apt in files below that directory. On my box, that is:

    /var/lib/dpkg/info/python3-apt.md5sums
    /var/lib/dpkg/info/libperl5.32:i386.symbols
    /var/lib/dpkg/info/apt.postinst
    /var/lib/dpkg/info/python3-apt.list
    

    Ah-ha! The string is mentioned in the post-installation script of the apt package. Peeking inside this file, you see:

    if [ "$1" = 'configure' ]; then
            # add unprivileged user for the apt methods
            adduser --force-badname --system --home /nonexistent  \
                --no-create-home --quiet _apt || true
    fi
    

    So: this really tries to create the user when the package is being configured, but it ignores any errors that may occur in the process (the || true). That explains why the system installation went fine and I got the warnings later (rather than a hard error during the installation).
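
    Re-running just that configure step is what dpkg-reconfigure is for; a minimal sketch using standard dpkg tooling:

    sudo dpkg-reconfigure apt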

    Just re-configuring the apt package would therefore be enough to either fix things or at least see an error message. But really, unless it's a huge package I tend to save on brain cycles and just run apt reinstall, which in this particular case leads to the somewhat funky command line:

    apt reinstall apt
    

    For me, this fixed the problem – and I've not bothered to fathom why the user creation failed during initial system setup. If you've seen the same problem and still have a record of the installation, perhaps you could investigate and file a bug if necessary?

  • Moving Clipboard Content Between Displays and Machines with Xclip

    Since Corona started, I've had to occasionally run zoom and other questionable telecon software. I don't want that proprietary junk on my main machine, partly because I'm a raving Free software lunatic, partly because binary packages from commercial vendors outside of the Debian main repository have a way of blowing up things years after one has put them on a box. I take some pride in never having re-installed my primary machine since 1996, so there would have been lots of opportunity for binary junk to accumulate.

    Hence I took a spare box I had sitting around idly, quickly put a simple Debian on its disk and then dumped all the questionable proprietary code next to its systemd and pulseaudio, reckoning that shredding that file system once the zoom pandemic is over will give me a lot of satisfaction.

    But now the various links, room ids and whatnot come in on the proper machine. Until a few days ago, I used to move them over to the zoom machine by having a screen open there, ssh-ing in from my main box, running screen -x to attach the screen that is already running in the ssh session, and then pasting the link into that shared screen. It works, but it feels clunky.

    The other day, I finally realised there's a better way using a nifty thing called xclip. I had already been using xclip for ages whenever I had two displays running on a single box and needed to copy and paste between the two displays; that happens when I'm at work. Then, I use the following key bindings (in this case for sawfish) on both ends:

    (bind-keys global-keymap "M-C-v"
            '(system "xclip -in < ~/.current-clipboard"))
    (bind-keys global-keymap "M-C-c"
            '(system "xclip -out > ~/.current-clipboard"))
    

    This lets me hit Alt-Ctrl-C on the first display and Alt-Ctrl-V on the second, and I'll then have what was in the primary selection on the first in the primary selection on the second.

    When webkit on gtk3 later started to copy links into the X11 clipboard rather than the primary selection and I wanted a quick way to get them to where I can middle-mouse them in again, I added another xclip binding to my sawfishrc:

    (bind-keys global-keymap "M-RET"
      '(system "xclip -out -selection clipboard | xclip -in"))
    

    – that's Meta-Return copying the content of the clipboard to the primary selection, and I've come to use that one quite extensively after initially piling quite a bit of abuse on the gtk3 policy of using the clipboard.

    What I noticed the other day was that xclip also lets me conveniently transport the telecon links. I've created an alias for that:

    alias zoomclip='xclip -o | ssh zoom "DISPLAY=:0 xclip -r -l 1 -i"'
    

    (zoom here is the name of the target machine). My new workflow is: select the string to transmit, run zoomclip in a terminal, hit the middle mouse button on the target machine to paste what I selected on the source machine. I'm not sure if it saves a lot of time over the old screen-based method, but it sure feels niftier, and I'd say that's reason enough for that alias.

    Note that the DISPLAY=:0 in the remote command is necessary because xclip of course is a normal X client and needs to know what display to talk to; and you want the local display on the target machine, not the display on the source machine. The -l 1, on the other hand, makes the xclip on the remote machine exit once you have pasted the content. Leave the option out if you expect to need to paste the thing multiple times. But without the -l 1, due to the way selections work on X11 (i.e., the system doesn't store selection content, you're always directly sending stuff between clients), xclip runs – and hence the ssh connection is kept up – until some other client takes over the selection.
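
    If what you want to transmit sits in the X11 clipboard rather than in the primary selection (see the gtk3 grumbling above), a clipboard variant of the alias suggests itself; the name zoomclip-cb is, of course, made up:

    alias zoomclip-cb='xclip -o -selection clipboard | ssh zoom "DISPLAY=:0 xclip -r -l 1 -i"'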

  • Bluetooth tethering with a 2021 Debian

    The other day the cage that holds the SIM card for the wireless modem of my Lenovo X240 rotted away. Fixing this (provided that's even reasonable, which I'm not sure about) requires digging quite a bit deeper into the machine than I consider proportional for something I use less than once a month. Plus, there's still my trusty N900 cellphone that I can use for the occasional GSM (or, where still available, UMTS) data connection.

    A SIM card cage on a table

    The underlying reason for mucking around with bluetooth tethering in 2021: the cage for the SIM card in my computer rotted out. And re-attaching it to the mainboard looks like surgery too deep for summer.

    So, I revived the ancient scripts I used to use around 2005 with feature phones playing cell modem and tried to adapt them to the N900. Ouch.

    Well, full disclosure: I have only hazy notions about what's going on in bluetooth in general, and the Linux tooling for bluetooth I find badly confusing. Hence, rather than reading manpages I started asking duckduckgo for recipes for “bluetooth tethering linux” or similar. And since what I found was either rather outdated or used various GUI and network management tools I prefer to not have to regularly run, let me write up what I ended up doing to tether my Debian box. Oh: it's using ifupdown and is Debian-specific in that sense, but otherwise I think it's fairly distribution-neutral, contrary to what you might expect after the headline.

    The bluetooth part

    The basic operation for bluetooth tethering is straightforward: Open a serial-like connection (“rfcomm”) to the phone, then start a pppd on top of it. It's the little details that make this tricky.

    The first detail for me is that I have a deep distrust of bluez (and even the bluetooth drivers). I hence keep bluetooth off and blocked most of the time, and before opening any bluetooth connection, I have to manage this bluetooth state, which I'm going to do in a plain shell script. I'm sure there's a very elegant way to do this in systemd, but then I'd say this is a case where the clarity of a shell script is hard to beat.

    So, I created a file /usr/local/sbin/bluenet containing this:

    #!/bin/sh
    # Configure/deconfigure an rfcomm bluetooth connection
    
    DEVICE_MAC=<YOUR PHONE'S BLUETOOTH ID>
    
    case $1 in
      start)
        /usr/sbin/rfkill unblock bluetooth
        /sbin/modprobe btusb
        /usr/sbin/service bluetooth start
        /usr/bin/rfcomm bind /dev/rfcomm0  $DEVICE_MAC
        sleep 2 # no idea what I should really be waiting for here, but
          # bluetooth clearly needs some time to shake out
        ;;
      stop)
        /usr/bin/rfcomm release /dev/rfcomm0
        /usr/sbin/service bluetooth stop
        /sbin/rmmod btusb
        /usr/sbin/rfkill block bluetooth
        ;;
      *)
        echo "Usage: $0 start|stop"
        exit 1
    esac
    

    All that you really need if you don't mind having bluetooth on are the two rfcomm command lines; the advantage of having this in a separate script (rather than just hacking the rfcomm calls into the ifupdown stanza) is that you can say bluenet stop after a failed connection attempt and don't have to remember what exactly you started and what the stupid rfcomm command line was. Oh: Resist the temptation to keep this script in your home directory; it will be executed as root by ifupdown and should not be writable by your normal user.

    To figure out your phone's bluetooth id, I think what people generally use these days is bluetoothctl (and for interactive use, it's fairly nice, if scantily documented). Essentially, you say scan on in there and wait until you see something looking like your phone (you'll have to temporarily make it discoverable for that, of course). While you're in there, run pair <mac>, too – at least for me, that was really straightforward compared to the hoops you had to jump through to pair bluetooth devices in Linux in the mid-2000s.
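
    In case it helps, a pairing session in bluetoothctl might look roughly like this (wait after scan on until something resembling your phone has shown up, and substitute its MAC):

    $ bluetoothctl
    [bluetooth]# power on
    [bluetooth]# scan on
    [bluetooth]# scan off
    [bluetooth]# pair <YOUR PHONE'S BLUETOOTH ID>
    [bluetooth]# quit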

    With this, you should be able to say:

    sudo /usr/local/sbin/bluenet start
    

    and then talk to a modem on /dev/rfcomm0. Try it with a terminal program:

    minicom -D /dev/rfcomm0
    

    In minicom, type AT and return, and try again if it doesn't work on the first attempt[1]; the phone should respond with OK. If it does, you're essentially in business. If it doesn't, try rfcomm -a – which should show something like:

    rfcomm0: DE:VI:CE:MA:CA:DD:RE channel 1 clean
    

    Oh, and bluetoothctl may be your (slightly twisted) friend; in particular info <mac> has helped me figure out problems. I still don't understand why my N900 doesn't show the rfcomm capability in there, though – again, I'm not nearly enough of a bluetooth buff to tell if that's normal in any way.

    The PPP part

    Assuming, on the other hand, you have your rfcomm connection, the rest is standard pppd fare (which, of course, makes me feel young again). This means you have to provide a provider-specific pppd configuration, conventionally in /etc/ppp/peers/bluez (replace bluez with whatever you like), which for me looks like this:

    /dev/rfcomm0
    115200
    debug
    noauth
    usepeerdns
    receive-all
    ipcp-accept-remote
    ipcp-accept-local
    local
    nocrtscts
    defaultroute
    noipdefault
    noipv6
    connect "/usr/sbin/chat -v -f /etc/ppp/chat-bluez"
    
    lcp-echo-interval 300
    lcp-echo-failure 10
    

    Some of this is a good idea whenever there's not actually a serial port in the game (local, nocrtscts), some may break your setup (noauth, though I think that's fairly normal today), and the lcp-echo things I found useful to detect lost connections, which of course are rather common on cellular data. Oh, and then there's the noipv6 that you may object to.

    Anyway: you may need to gently adapt your pppd peers file. Use your common sense and browse the pppd man page if you can't help it.

    The chat script /etc/ppp/chat-bluez mentioned in the peers file for me (who's using a German E-Netz reseller) looks like this:

    TIMEOUT 5
    ECHO ON
    ABORT 'ERROR'
    ABORT 'NO ANSWER'
    ABORT 'NO CARRIER'
    ABORT 'NO DIALTONE'
    '' "ATZ"
    OK-ATZ-OK ATE1
    OK 'AT+CGDCONT=1,"IP","internet.eplus.de","0.0.0.0"'
    TIMEOUT 15
    OK "ATD*99***1#"
    CONNECT ""
    

    Essentially I'm doing a modem reset (ATZ), and (to accommodate the swallowed initial characters mentioned above) I'm trying again if the first attempt failed. The ATE1 enables echo mode to help debugging, and then comes the magic in the CGDCONT (“Define packet data protocol (PDP) context”) AT command – you will have to adapt at least the string internet.eplus.de to your provider's APN (there are public lists of those). The rest you can probably keep as-is; nobody really uses more than one profile (the 1), or a PDP type other than IP (well, there's IPV4V6 that might interest you if you've removed the noipv6 statement above), or the PDP address in the last argument.

    The final dial command (the ATD) is, I think, fairly standard (it says, essentially: Call the PDP context 1 set up in CGDCONT).
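
    If you'd like to rehearse this by hand before letting chat do it, the same dialogue typed into minicom should look roughly like this – the OK and CONNECT lines being what the phone ought to answer (and remember the swallowed first characters: you may have to type the first command twice):

    ATZ
    OK
    ATE1
    OK
    AT+CGDCONT=1,"IP","internet.eplus.de","0.0.0.0"
    OK
    ATD*99***1#
    CONNECT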

    All this assumes your provider does not require authentication; I think most don't these days. If they do, say hello to /etc/ppp/pap-secrets and the pppd manpage (you have my sympathy).

    Finally, the various components are assembled in a stanza in /etc/network/interfaces:

    iface n900 inet ppp
      pre-up /usr/local/sbin/bluenet start
      provider bluez
      post-down /usr/local/sbin/bluenet stop
    

    That's it – your tethered network should now come up with ifup n900, and you can take it down with ifdown n900. If the thing gets stuck, meaning the interface won't come up as far as Debian is concerned, the post-down action will not be run. Just run bluenet stop manually in that case.

    Amazing endurance

    One thing blew my mind when I did this: A good decade ago, the Nokia N900 didn't come with the profile rfcomm uses enabled (it's called “DUN” there; don't ask me why). It didn't need much to enable it, but you needed to drop some magic file for the upstart init system running in the N900's maemo into some strategic location, and some kind soul had packaged that up. Not expecting much I simply ran apt-get install bluetooth-dun – and it just worked. Whoever still keeps up these ancient maemo repositories: Thanks a lot, and I'm impressed. If we ever meet, remind me to buy you a $BEVERAGE.

    [1] In the communication with the N900 and its ancient bluetooth stack, for some reason the first character(s) tend to get swallowed somewhere on the way. I've not tried to ascertain who is to blame; perhaps it's the autoconnection of rfcomm?
  • udev, thinkpad docks, sawfish

    The other day someone gave me another dock for my thinkpad, and I eventually decided to use it at home. I've had a dock at the office for a long time, and docking there involved running a complex script configuring the network environment, running a window manager on some display on a desktop machine, and only exiting when the dock was supposed to end; its execution was triggered when the wake-up script noticed a dock was present.

    Now, when there are two docks and one is for rather conventional at-home use (essentially, simply configuring a different monitor and network adapter), I decided to do it properly and go through udev. Which turned out to be tricky enough that I'll turn this note to my future self into a blog post.

    udev

    What turned out to be the most complicated part was figuring out the udev rules. That's because for ages I have been using:

    udevadm info -a -p some/sysfs/path
    

    to work out matchable attributes for a device. That's more or less fine if all you're after is rules for attaching devices. For the dock, however, the removal event is important, too. But when the removal event comes in, udev has forgotten essentially all of the attributes that come from info -a, and rules that work with add simply won't fire with remove.

    So, here's my new policy: I'll use:

    udevadm monitor --environment --udev
    

    (where the udev option restricts events to udev rather than kernel events, which for the deluge of events coming from the dock seemed smart; I may want to revisit that). When you then plug things in or out, you'll directly see what events you can match against. Nice.

    Except of course for the deluge of events just mentioned: A dock just has quite a few devices. The device I consider most characteristic, however, generates two add events, and I've not found a good way to tell the two of them apart. Still, this is what I've put into /etc/udev/rules.d/95-docking.rules:

    ACTION=="add", SUBSYSTEM=="usb", ENV{ID_VENDOR_ID}=="17ef", \
      ENV{ID_MODEL_ID}=="1010",  ENV{DEVTYPE}=="usb_device", \
      RUN+="/bin/su <your user id> -c '/full-path-to-dock-script start'"
    
    ACTION=="remove",  ENV{SUBSYSTEM}=="usb", ENV{PRODUCT}=="17ef/1010/5040", \
      ENV{DEVTYPE}=="usb_device", \
      RUN+="/bin/su <your user id> -c '/full-path-to-dock-script stop'"
    

    Important (and having forgotten about it again gave me quite a bit of frustration): Rather sensibly, udev has no idea of the shell path and will just fail silently when it cannot execute what's in RUN. Hence you must (essentially) always give full path names in udev RUN actions. In case of doubt, try RUN+="/usr/bin/logger 'rule fires'" in a rule and watch the syslog.

    For this kind of thing, by the way, you'll rather certainly want to use su (or go through policykit, but I can't bring myself to like it). You see, I want the dock script in my home directory and owned by me; having such a thing be executed as root (which udev does) would be a nice backdoor for emergencies, but will generally count as a bad idea.

    On the double dock event… well, we're dealing with shell scripts here, so we'll hack around it.

    Dock script: sawfish to the rescue

    udev only lets you execute short scripts these days and rigorously kills everything spawned from udev rules when it has finished processing the events. I suppose that's a good policy for general system stability and reducing unpleasant surprises. But for little hacks like the one I'm after here, it seems to be a bit of a pain at first.

    What it means in practice is that you need something else to execute the actual dock script. In my case, that thing is my window manager, sawfish, and having the window manager do this is rather satisfying, which reinforces my positive feeling towards udev's kill policy (although, truth be told, the actual implementation is in shell rather than in sawfish's scheme).

    To keep everything nicely together, the docking script at its core is a bash case statement, in essence:

    #!/bin/bash
    # bookkeeping: we need to undock if that file is present
    UNDOCK_FILE=~/.do-undock
    
    # display for the window manager we talk to
    export DISPLAY=:0
    
    case $1 in
      start)
        sawfish-client -c "(system \"urxvt -geometry -0+0 -e $0 on &\")"
        ;;
      stop)
        sawfish-client -c "(system \"urxvt -geometry -0+0 -e $0 off &\")"
        ;;
      on)
        if [[ -f $UNDOCK_FILE &&
          $((`date +"%s"` - `date -r $UNDOCK_FILE +"%s"`)) -lt 20 ]]; then
            # debounce multiple dock requests
           exit 1
        fi
        touch $UNDOCK_FILE
    
        # let udev do its thing first; we're no longer running from udev
        # when we're here.
        udevadm settle
    
        # Commands to dock follow here
        ;;
      off)
        if [ -f ~/.do-undock ]; then
          rm ~/.do-undock
          # Commands to undock in here.
        fi
        ;;
      *)
        echo "Usage $0 (start|stop|on|off)"
        ;;
    esac
    

    The plan is: Udev calls the script with start and stop, which then arranges for sawfish to call the script from sawfish with the on and off arguments.

    There's a bit of bookkeeping with a file in home I keep to see whether we need to undock (in my setup, that's not necessary at work), and which I use to debounce the duplicate dock request from udev. That part could be improved by using lockfile(1), because the way it is written right now there are race conditions (between the -f, the date, and the touch) – perhaps I'll do it when I next have time budgeted for OS fiddling.
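
    A sketch of what that improvement might look like (untested; lockfile comes with the procmail package, and ~/.dock-lock is a made-up name):

    # replace the -f/date/touch dance in the on) branch with an atomic
    # lock: fail immediately (-r 0) if a lock younger than 20 seconds
    # (-l 20) exists, i.e. if we have just docked
    if ! lockfile -r 0 -l 20 ~/.dock-lock 2>/dev/null; then
      exit 1
    fi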

    One thing I like a lot is the udevadm settle; this basically lets my script rely on a defined state where all the devices it may want to talk to are guaranteed to be initialised as far as udev goes. This is so much nicer than that sleep 3 you can see in too many places.

    What I do when docking

    Why go into all this trouble and not let whatever automagic is active pick up the screen and the new network interface and be done with it? Well, partly because I don't run most of the software that does that magic. But of course having a shell script lets me say what I actually want:

    • disable sleep on lid closing (that's special to my own ACPI hacks from the depths of time)
    • configure the external screen as primary (that's something like xrandr --output DP2-1 --off ; xrandr --fb 2048x1152 --output DP2-1 --auto for me; don't ask why I first need to switch off the display, but without it the --auto will get confused – see the sketch after this list).
    • switch to an empty (“dock-only” if you will) page on the desktop (that's wmctrl -o 4096,1152 for me).
    • sure enough, switch on some desktop glitz that I'm too stingy for when off the grid.
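
    Put together, the “Commands to dock follow here” part of the script above comes down to something like this for me; the output name and the geometries are of course specific to my hardware:

    # dock: make the external screen primary, jump to an empty desktop page
    xrandr --output DP2-1 --off
    xrandr --fb 2048x1152 --output DP2-1 --auto
    wmctrl -o 4096,1152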

    One thing I'm currently doing in the dock script that I shouldn't be doing there: I'm now using a wacom bamboo pad I've inherited as a mouse replacement at home and was surprised that no middle mouse button (Button2) was configured automatically on it. Perhaps some search engine will pick this up and save a poor soul looking for a quick solution from reading man pages and xsetwacom output:

    xsetwacom set "Wacom BambooPT 2FG 4x5 Pad pad" Button 8 2
    xsetwacom set "Wacom BambooPT 2FG 4x5 Pad pad" Button 9 2
    

    (this makes both of the middle buttons act as middle mouse buttons; I'll see if I like that in the long run). Of course, in the end, these lines belong in a udev rule for the wacom tablet rather than in a dock script. See above on these.
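
    Such a rule would presumably follow the pattern of the dock rules above; a sketch (056a is Wacom's USB vendor id; the script path is a placeholder, and the script itself would again need DISPLAY set and perhaps a udevadm settle):

    ACTION=="add", SUBSYSTEM=="usb", ENV{ID_VENDOR_ID}=="056a", \
      ENV{DEVTYPE}=="usb_device", \
      RUN+="/bin/su <your user id> -c '/full-path-to-wacom-script'"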

  • Upgraded to bullseye

    I've upgraded my personal notebook – with a file system that has a continuous history back to a slackware in 1996 and thus always is the most rewarding thing to test upgrades on – to Debian bullseye today.

    It's been a while since the last dist-upgrade messed up my X or rendered a machine unbootable, but they still never fail to be exciting. This one was no exception.

    logind and systemd

    The one major trouble as far as Debian proper is concerned was that the full-upgrade pulled in systemd again (which I still can't get myself to adopt on boxes I fiddle with a lot). This was because at some point I had purged elogind (which doesn't do anything on this box), and both a few KDE programs I have installed and the indispensable gparted need policykit-1, which in turn pulls in some logind; if you don't have one, apt will migrate you to systemd.

    Well, it was easy to go back this time: Just say:

    apt install elogind
    apt install sysvinit-core
    

    and all is back to normal again with the wonderful shell script goo during system startup. Hooray for elogind! I will admit I've not quite figured out what its actual use is, though. But that's probably because I've never quite figured out policykit, which in turn is probably because I think there's nothing wrong with sudo. But, as you'll see in a second, I might be swayed.

    Sure enough: Power

    I'm running all kinds of interesting scripts when the machine goes through various power states; for instance, I'm kill -STOP-ing a few pieces of moving eye candy when the box loses grid power, and I'm kill -CONT-ing them when the power is back. This happens through shell scripts I've dropped into /etc/pm/power.d, from where pm-utils has nicely been executing them for the last 10 years or so.

    Alas, they didn't run any more after the upgrade. Instead, when I shut the lid, the box would go right back to sleep after waking up. This last thing was fixed quickly: Just tell elogind not to bother in /etc/elogind/logind.conf.

    That the pre-sleep and post-wakeup scripts still ran soothed my first worry – that pm-utils might have had an RC- (release critical) bug and dropped out of Debian. Going through pm-utils' /usr/share/doc info made me worry again, though: the last upstream change there is from 2010, and the last Debian changelog entry is from 2019, mentioning an open RC bug. Uh-oh. It seems I might soon need to try harder with elogind.

    But not just yet, as the trace to work this out was bug #772275 (oh yes, the bug page for pm-utils makes me worry, too): pm-utils used to receive the AC/Battery notification from acpi-support, and that clearly broke in some way. At least for me, and with this upgrade. Poking around a bit in /etc/acpi didn't show an immediate hook; yes, there's power.sh, but that gets called a lot on my box if the moon is right (for Lenovo's crappy firmware at least), and one would need to figure out whether or not there's grid power oneself.

    So, I dug a bit deeper and noticed that ever since I've moved from laptop-mode-tools to tlp, pm-utils was almost obsolete, because tlp actually does everything pm-utils did for me – but it doesn't let me run my beloved shell scripts (“by design”, its authors say). Hence, it's not byebye to pm-utils yet.

    But I like the way tlp gets notified of power events: through udev. Taking that thought a bit further so I don't have to do any manual state management (as pm-utils doesn't have the equivalent of tlp auto) and filtering out power events for batteries (which I don't care about), I ended up introducing two new udev rules that look relatively generic to me:

    ACTION=="change", SUBSYSTEM=="power_supply", ATTR{type}=="Mains",\
      ATTR{online}=="1", RUN+="/usr/sbin/pm-powersave false"
    ACTION=="change", SUBSYSTEM=="power_supply", ATTR{type}=="Mains",\
      ATTR{online}=="0", RUN+="/usr/sbin/pm-powersave true"
    

    Drop this into /etc/udev/rules.d/10local.rules (or so), and pm-utils' power.d works again.
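
    In case you're wondering what such a power.d script looks like: pm-powersave runs each of them with true when you're supposed to save power (i.e., you're on battery) and with false when grid power is back. A minimal sketch along the lines of what I do – the process name is made up:

    #!/bin/sh
    # /etc/pm/power.d/eyecandy (example): pause moving eye candy on
    # battery, resume it on grid power
    case $1 in
      true)  pkill -STOP xdaliclock ;;   # entering powersave
      false) pkill -CONT xdaliclock ;;   # leaving powersave
    esac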

    Another python2 grace time

    But the real elephant in the room is that bullseye in effect drops Python version 2. While this certainly does not come as a surprise, it still hurts me a lot, because I have plenty of self-written larger or smaller python2 programs – my audiobook-reader, my local wikipedia, my work time accounting and a gazillion little other things. And there's things like editmoin that haven't been ported yet either.

    Well, I had hoped I could keep the buster python2 packages around, perhaps even using the python-is-python2 package. But really, I don't think that's an option for a halfway lively system (which will use quite a few python3 packages). I gave up on that idea more or less as soon as I realised that the python-docutils-common dependency (and I need docutils left and right) will conflict between the docutils from buster and the one from bullseye. Trying to keep buster packages will clearly become incredibly fiddly.

    So, instead I figured I ought to keep the legacy software alive while finally porting it as I go along (one of them, my one-line CLI, I actually ported this morning), using a python2 “virtual” (yeah, right, virtual...) environment.

    Yes, virtual environments are evil all around, not only because their content rots without anyone noticing; but then this is exactly about letting things rot in a halfway controlled fashion, so I claim this is a use case.

    In case others mourn the demise of python2 in bullseye and want to go slowly when migrating, here's what to do:

    1. Make sure the python2 packages that still are in bullseye are in place. This would be python2.7, python2.7-dev, and presumably python-tk. Of course, you will want the virtualenv package, but that's already python3.

    2. Create the virtual environment:

      virtualenv -p python2.7 ~/.legacy-python
      
    3. Make it simple to use that. For that, add:

      alias enable-oldpython='export PATH=~/.legacy-python/bin:$PATH'
      

      to your .aliases (or wherever else you keep your aliases) and exec bash in the current shell to try that out. This is what you want when you run pip, or at any other time when you want your python to be what's in the virtual environment.

      But this won't work for hashbangs. To make that work, put a file like:

      #!/bin/sh
      export PATH=/home/<YOUR USERNAME>/.legacy-python/bin/:$PATH
      exec python "$@"
      

      somewhere into your path as, say, oldpython. Since I still have some system-wide things using python2, I ended up sticking this into /usr/local/bin. Given python2 has been out of security support for more than a year now, I might be regretting that; on the other hand, python's core hasn't had many security problems in the past 20 years, and so I figure I am fine. Caveat emptor, though.

    4. Then, run pip install and/or python setup.py install to your heart's delight. All this isn't forever, so this one time I found myself capable of forgetting the long run, later upgrades, and all that. Just remember: sudo and pip never mix, and they mix particularly badly here. Oh: I actually found myself apt-get source-ing python packages from buster and just running python setup.py install in them because in my book that's a good deal less obscure than pip (sketched below).
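
    For illustration, the whole dance for one package; the package name is made up, and the apt-get source step assumes you still have buster deb-src lines in your apt configuration:

    enable-oldpython                    # the alias from step 3
    apt-get source somepackage/buster   # made-up package name
    cd somepackage-*/
    python setup.py install             # no sudo: goes into ~/.legacy-python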

    Limping along with a private MoinMoin

    But then came the bitter realisation: There's no moinmoin in bullseye any more. That's a killer for at least three servers I'm operating. And, really, looking at what the MoinMoin folks write on python3 (in particular at its list of dependencies), I shudder in the expectation of seeing something quite in line with my unpleasant experiences with mailman2 happen with MoinMoin.

    On my box, however, I can live with an aging service (that only listens to localhost), and I can live with having moinmoin be a CGI. In case these considerations (typically, for a “notes and observations”-style wiki) apply to you as well, here's what I did to make the legacy moinmoin run in my bullseye apache2.

    First, I installed moinmoin into the “virtual” python 2.7 I created above:

    enable-oldpython  # the alias above
    pip install moin  # no sudo!
    

    Then I fixed the apache configuration to use that. So, I commented out the previous MoinMoin integration and replaced it with something like:

    <Directory /home/<YOUR USER NAME>/.legacy-python/share/moin/server>
      AllowOverride None
      Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
      Require all granted
    </Directory>
    <Directory /home/<YOUR USER NAME>/.legacy-python/lib/python2.7/site-packages/MoinMoin/web/static/htdocs>
      AllowOverride None
      Require all granted
    </Directory>
    
    ScriptAlias /meiner /home/<YOUR USER NAME>/.legacy-python/share/moin/server/moin.cgi
    Alias /wiki/ /home/<YOUR USER NAME>/.legacy-python/lib/python2.7/site-packages/MoinMoin/web/static/htdocs
    Alias /moin_static1911 /home/<YOUR USER NAME>/.legacy-python/lib/python2.7/site-packages/MoinMoin/web/static/htdocs
    

    (you may need …

  • A Mail Server on Debian

    After decades of (essentially) using other people's smarthosts to send mail, I have recently needed to run a full-blown, deliver-to-the-world mail server (cf. Das Fürchten lernen; it's in German, though).

    While I had expected this to be a major nightmare, it turns out it's not so bad at all. Therefore I thought I'd write up a little how-to-like thing – perhaps it will help someone to set up their own mail server. Which would be a good thing. Don't leave mail to the bigshots, it's too important for that.

    Preparation

    You'll want to at least skim the exim4 page on the Debian wiki as well as /usr/share/doc/exim4/README.Debian.gz on your box. Don't worry if any of that talks about things you've never heard about at this point and come back here.

    The most important thing to work out in advance is to get your DNS to look attractive to the various spam estimators; I didn't have that (mostly because I moved “secondary” domains first), which caused a lot of headache (that article again is in German).

    How do you look attractive? Well, in your DNS make sure the PTR for your IP is to mail.<your-domain>, and make sure mail.<your-domain> exists and resolves to that IP or a CNAME pointing there. Note that this means that you can forget about running a mail server on a dynamic IP. But then dynamic IPs are a pain anyway.

    Before doing anything else, wait until the TTL of any previous records of this kind has expired. Don't take this lightly, and if you don't understand what I've been saying here, read up on DNS in the meantime. You won't have much joy with your mail server without a reasonable grasp of reverse DNS, DNS caching, and MX records.

    Use the opportunity to set the TTL of the MX record(s) for your domain(s) to a few minutes perhaps. Once you have configured your mail system, you can then quickly change where other hosts will deliver their mail for your domain(s) and raise the TTLs again.
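
    To check whether your DNS already looks right, dig is your friend; with your actual IP in place of the 192.0.2.1 documentation address, something like this should come back with the answers discussed above:

    dig +short -x 192.0.2.1          # PTR: should say mail.<your domain>.
    dig +short mail.<your domain>.   # should give your IP (or a CNAME pointing there)
    dig +short MX <your domain>.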

    Exim4

    Debian has come with the mail transfer agent (MTA; the mail server proper if you will) exim4 installed by default for a long, long time, and I've been using it on many boxes to feed the smart hosts for as long as I've been using Debian. So, I'll not migrate to something else just because my mail server will talk to the world now. Still, you'll have to dpkg-reconfigure exim4-config. Much of what's being asked by that is well explained in the help texts. Just a few hints:

    • “General type of mail configuration” would obviously be “internet site”.
    • Mail name ought to be <your domain>; if you have multiple domains, choose the one you'd like to see if someone mails without choosing any.
    • Keep the IP addresses to listen on empty – you want other hosts to deliver mail on port 25. Technically, it would be enough to listen only on the address your MX record points to, but that's a complication that's rarely useful.
    • Relaying mail for non-local domains is what you want if you want to be a smart host yourself. You'll pretty certainly want to keep this empty, as it's easy to mess up, and it's easy to configure authenticated SMTP even on clients (also see Client Connections below on avoiding authenticated SMTP altogether).
    • Exim also is a mail delivery agent (MDA), i.e., something that will put mail for domains it handles into people's mail boxes. I'll assume below that you select Maildir format in home directory as the delivery method. Maildir is so much cooler than the ancient mboxes, and whoever wants something else can still use .forward or procmail.
    • And do split your configuration into small files. Sure, you'll have to remember to run update-exim4.conf after your edits, but that little effort will be totally worth it after your next dist-upgrade, when you won't have to merge the (large) exim4 config file manually and figure out what changes you did where.

    DNS Edits

    With this, you should be in business for receiving mail. Hence, make your MX record point to your new mail server. In an NSD zone file (and NSD is my choice for running my DNS server), this could look like:

    <your domain>.  IN MX 10 <your domain>.
    

    (as usual in such files: Don't forget the trailing dots).

    A couple of years ago, it was all the rage to play games with having multiple MX records to fend off spam. It's definitely not worth it any more.

    While I personally think SPF is a really bad idea, some spam filters will regard your mail more kindly if they find an SPF record. So, unless you have stronger reasons to not have one than just “SPF is a bad concept and breaks sane mailing list practices, .forward files and simple mail bouncing”, add a record like this:

    <your domain>.                3600    IN      TXT     "v=spf1" "+mx" "+a" "+ip4:127.0.0.1" "-all"
    

    – where you have to replace the 127.0.0.1 with your IP and perhaps add a similar ip6 clause. What this means: Mail coming from senders in <your domain> ought to originate at the IP(s) given, and when it comes from somewhere else it's fishy. Which is why this breaks good mailing list practices. But fortunately most spam filters know that and don't interpret these SPF clauses too narrow-mindedly.

    SSL

    I'm not a huge fan of SSL as a base for cryptography – X.509 alone is scary and a poor defense against state actors –, but since it's 2021, having non-SSL services doesn't look good. Since it's important to look good so people accept your mail, add SSL to your exim installation.

    Unless you can get longer-living, generally-trusted SSL certificates from somewhere else, use letsencrypt certificates. Since (possibly among others) the folks from t-online.de want to see, on such a web site, some declaration of who is behind a mail server, set up a web server for mail.<your-domain> and obtain letsencrypt SSL certificates for it in whatever way you do that.

    Then, in the post-update script of your letsencrypt updater, run something like:

    /bin/cp mail.crt mail.key /etc/exim4/ssl/
    /usr/sbin/service exim4 restart
    

    (which of course assumes that script runs as root or at least with sufficient privileges). /etc/exim4/ssl you'll have to create yourself, and to keep your key material at least a bit secret, do a:

    chown root:Debian-exim /etc/exim4/ssl
    chmod 750 /etc/exim4/ssl
    

    – that way, exim can read it even if it's already dropped its privileges, but ordinary users on your box cannot.

    Then tell exim about your keys. For that, use some file in /etc/exim4/conf.d/main; such files are the main way of configuring the exim4 package in non-trivial ways. I have 00_localmacros, which contains:

    MAIN_TLS_ENABLE = yes
    MAIN_TLS_CERTIFICATE = /etc/exim4/ssl/mail.crt
    MAIN_TLS_PRIVATEKEY = /etc/exim4/ssl/mail.key
    

    – that ought to work for you, too.

    Then, do the usual update-exim4.conf && service exim4 restart, and you should be able to speak SSL with your exim. The easiest way to test this is to install the swaks package (which will come in useful when you want to run authenticated SMTP or similar, too) and then run:

    swaks -a -tls -q HELO -s mail.<your domain> -au test -ap '<>'
    

    This will spit out the dialogue with your mail server and at some point say 220 TLS go ahead or so if things work, some more or less helpful error message if not.

    Aliases

    Exim comes with the most important aliases (e.g., postmaster) pre-configured in /etc/aliases. If you want to accept mail for people not in your /etc/passwd, add them there.

    The way this is set up, exim ignores domains; if you told exim to accept mails for domain1.de and domain2.fi, then mail to both user@domain1.de and user@domain2.fi will end up in /home/user/Maildir (or be rejected if user doesn't exist and there's no alias either). If you want to have domain-specific handling, add a file /etc/exim4/forwards that contains pairs like:

    drjekyll@example.org: mrhyde@otherexample.org
    

    The standard Debian configuration of Exim will not evaluate this file; to make it do that, drop a file with something like:

    # Route using a global incoming -> outgoing alias file
    
    global_aliases:
      debug_print = "R: global_aliases for $local_part@$domain"
      driver = redirect
      domains = +local_domains
      allow_fail
      allow_defer
      data = ${lookup{$local_part@$domain}lsearch{/etc/exim4/forwards}}
    

    into (say) /etc/exim4/conf.d/router/450_other-aliases. After the usual update-exim4.conf, you should be good to go.
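
    To see whether the router actually picks up your aliases, exim's address testing mode comes in handy; with the example pair from above, this should report a redirect to mrhyde@otherexample.org:

    exim4 -bt drjekyll@example.org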

    Client Connections

    This setup only accepts mail for transport locally, and it will only deliver locally. That is: This isn't a smarthost setup yet.

    For delivery from remote systems, we're using ssh because pubkey auth is cool. This even works from an exim on the remote system …

  • Perhaps I should be moving to gentoo

    I'm reading PDFs quite a bit, most of them in my beloved zathura. So, I was dismayed when today I paged through a book that showed in zathura as on the left side of this figure:

    Renderings of a PDF in poppler and mupdf.

    The metrics are off so badly that readability suffers.

    Rather than try to fix the PDF, I remembered I had for a long time wanted to look into using mupdf as a backend for zathura rather than its default poppler, if only because poppler used to have a shocking amount of rather serious bugs a couple of years ago (now that I think of it: It's been a while since I last heard of any – hm).

    Bringing up the PDF in mupdf looked a lot better (the right panel in the above figure). Which then led to a bout of yak shaving, because there is a plugin for zathura that promised to do what I wanted, zathura-pdf-mupdf, but of course nobody has bothered to package it up for Debian yet. So… let's try to build it.

    It's probably not a good sign that the package's README talks about make to build the thing, whereas the web page talks about a build system with commands meson and ninja (that, frankly, I had never heard about before, but at least it's in Debian). But, never mind, let's do meson build && cd build && ninja (oh wow).

    Of course, building fails with something like:

    ../zathura-pdf-mupdf/index.c: In function ‘build_index’:
    ../zathura-pdf-mupdf/index.c:68:7: error: unknown type name ‘fz_location’; did you mean ‘fz_catch’?
           fz_location location = fz_resolve_link(ctx, document, outline->uri, &x, &y);
           ^~~~~~~~~~~
    

    A quick web search shows that this fz_location is part of the mupdf API and has indeed undergone an API change. So, I backported libmupdf from Debian testing (I'm on Debian stable almost always), and because that needs mujs, I backported that, too. Mujs sounds a lot like javascript in PDF, and that's where I first thought of gentoo: with its USE flags it would probably make it easier to just keep javascript out of my PDF rendering engines altogether. Which is something I'd consider an excellent idea.

    Anyway, with a bit of hacking around – I don't have a libmupdf-third library that the meson build file mentions but perhaps doesn't need any more – I then got the plugin to build.

    Regrettably, zathura still would not use mupdf to render, saying:

    error: Could not load plugin '/usr/lib/i386-linux-gnu/zathura/libpdf-mupdf.so'
    (/usr/lib/i386-linux-gnu/zathura/libpdf-mupdf.so:
    undefined symbol: jpeg_resync_to_restart).
    

    Again asking a search engine about typical scenarios that would lead to this failure when loading a plugin, there's quite a bit of speculation, one part of it being about using libjpeg-turbo instead of libjpeg. Which made me look at what this plugin links against. Fasten your seat belts:

    $ ldd /usr/lib/i386-linux-gnu/zathura/libpdf-mupdf.so
            linux-gate.so.1 (0xf7fa7000)
            libgirara-gtk3.so.3 => /usr/lib/i386-linux-gnu/libgirara-gtk3.so.3 (0xf5d23000)
            libcairo.so.2 => /usr/lib/i386-linux-gnu/libcairo.so.2 (0xf5bd3000)
            libglib-2.0.so.0 => /usr/lib/i386-linux-gnu/libglib-2.0.so.0 (0xf5a9a000)
            libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf58bc000)
            libgtk-3.so.0 => /usr/lib/i386-linux-gnu/libgtk-3.so.0 (0xf50bc000)
            libgdk-3.so.0 => /usr/lib/i386-linux-gnu/libgdk-3.so.0 (0xf4fae000)
            libpango-1.0.so.0 => /usr/lib/i386-linux-gnu/libpango-1.0.so.0 (0xf4f5f000)
            libgio-2.0.so.0 => /usr/lib/i386-linux-gnu/libgio-2.0.so.0 (0xf4d57000)
            libgobject-2.0.so.0 => /usr/lib/i386-linux-gnu/libgobject-2.0.so.0 (0xf4cf2000)
            libjson-c.so.3 => /usr/lib/i386-linux-gnu/libjson-c.so.3 (0xf4ce5000)
            libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf4cc2000)
            libpixman-1.so.0 => /usr/lib/i386-linux-gnu/libpixman-1.so.0 (0xf4c12000)
            libfontconfig.so.1 => /usr/lib/i386-linux-gnu/libfontconfig.so.1 (0xf4bc5000)
            libfreetype.so.6 => /usr/lib/i386-linux-gnu/libfreetype.so.6 (0xf4b02000)
            libpng16.so.16 => /usr/lib/i386-linux-gnu/libpng16.so.16 (0xf4ac3000)
            libxcb-shm.so.0 => /usr/lib/i386-linux-gnu/libxcb-shm.so.0 (0xf4abe000)
            libxcb.so.1 => /usr/lib/i386-linux-gnu/libxcb.so.1 (0xf4a90000)
            libxcb-render.so.0 => /usr/lib/i386-linux-gnu/libxcb-render.so.0 (0xf4a81000)
            libXrender.so.1 => /usr/lib/i386-linux-gnu/libXrender.so.1 (0xf4a75000)
            libX11.so.6 => /usr/lib/i386-linux-gnu/libX11.so.6 (0xf4926000)
            libXext.so.6 => /usr/lib/i386-linux-gnu/libXext.so.6 (0xf4911000)
            libz.so.1 => /lib/i386-linux-gnu/libz.so.1 (0xf48f0000)
            librt.so.1 => /lib/i386-linux-gnu/librt.so.1 (0xf48e5000)
            libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf47df000)
            libpcre.so.3 => /lib/i386-linux-gnu/libpcre.so.3 (0xf4766000)
            /lib/ld-linux.so.2 (0xf7fa9000)
            libgmodule-2.0.so.0 => /usr/lib/i386-linux-gnu/libgmodule-2.0.so.0 (0xf4760000)
            libpangocairo-1.0.so.0 => /usr/lib/i386-linux-gnu/libpangocairo-1.0.so.0 (0xf4750000)
            libXi.so.6 => /usr/lib/i386-linux-gnu/libXi.so.6 (0xf473d000)
            libXcomposite.so.1 => /usr/lib/i386-linux-gnu/libXcomposite.so.1 (0xf4739000)
            libXdamage.so.1 => /usr/lib/i386-linux-gnu/libXdamage.so.1 (0xf4734000)
            libXfixes.so.3 => /usr/lib/i386-linux-gnu/libXfixes.so.3 (0xf472d000)
            libcairo-gobject.so.2 => /usr/lib/i386-linux-gnu/libcairo-gobject.so.2 (0xf4721000)
            libgdk_pixbuf-2.0.so.0 => /usr/lib/i386-linux-gnu/libgdk_pixbuf-2.0.so.0 (0xf46f4000)
            libatk-1.0.so.0 => /usr/lib/i386-linux-gnu/libatk-1.0.so.0 (0xf46cb000)
            libatk-bridge-2.0.so.0 => /usr/lib/i386-linux-gnu/libatk-bridge-2.0.so.0 (0xf4693000)
            libxkbcommon.so.0 => /usr/lib/i386-linux-gnu/libxkbcommon.so.0 (0xf464d000)
            libwayland-cursor.so.0 => /usr/lib/i386-linux-gnu/libwayland-cursor.so.0 (0xf4644000)
            libwayland-egl.so.1 => /usr/lib/i386-linux-gnu/libwayland-egl.so.1 (0xf463f000)
            libwayland-client.so.0 => /usr/lib/i386-linux-gnu/libwayland-client.so.0 (0xf4630000)
            libepoxy.so.0 => /usr/lib/i386-linux-gnu/libepoxy.so.0 (0xf451e000)
            libharfbuzz.so.0 => /usr/lib/i386-linux-gnu/libharfbuzz.so.0 (0xf4407000)
            libpangoft2-1.0.so.0 => /usr/lib/i386-linux-gnu/libpangoft2-1.0.so.0 (0xf43ee000)
            libXinerama.so.1 => /usr/lib/i386-linux-gnu/libXinerama.so.1 (0xf43e7000)
            libXrandr.so.2 => /usr/lib/i386-linux-gnu/libXrandr.so.2 (0xf43da000)
            libXcursor.so.1 => /usr/lib/i386-linux-gnu/libXcursor.so.1 (0xf43cd000)
            libthai.so.0 => /usr/lib/i386-linux-gnu/libthai.so.0 (0xf43c1000)
            libfribidi.so.0 => /usr/lib/i386-linux-gnu/libfribidi.so.0 (0xf43a5000)
            libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xf439f000)
            libmount.so.1 => /lib/i386-linux-gnu/libmount.so.1 (0xf4333000)
            libselinux.so.1 => /lib/i386-linux-gnu/libselinux.so.1 (0xf4306000)
            libresolv.so.2 => /lib/i386-linux-gnu/libresolv.so.2 (0xf42ec000)
            libffi.so.6 => /usr/lib/i386-linux-gnu/libffi.so.6 (0xf42e2000)
            libexpat.so.1 => /lib/i386-linux-gnu/libexpat.so.1 (0xf42a5000)
            libuuid.so.1 => /lib/i386-linux-gnu/libuuid.so.1 (0xf429b000)
            libXau.so.6 => /usr/lib/i386-linux-gnu/libXau.so.6 (0xf4296000)
            libXdmcp.so.6 => /usr/lib/i386-linux-gnu/libXdmcp.so.6 (0xf428f000)
            libdbus-1.so.3 => /lib/i386-linux-gnu/libdbus-1.so.3 (0xf4230000)
            libatspi.so.0 => /usr/lib/i386-linux-gnu/libatspi.so.0 (0xf41fb000)
            libgraphite2.so.3 => /usr/lib/i386-linux-gnu/libgraphite2.so.3 (0xf41cd000)
            libdatrie.so.1 => /usr/lib/i386-linux-gnu/libdatrie.so.1 (0xf41c3000)
            libblkid.so.1 => /lib/i386-linux-gnu/libblkid.so.1 (0xf4163000)
            libbsd.so.0 => /usr/lib/i386-linux-gnu/libbsd.so.0 (0xf4144000)
            libsystemd.so.0 => /lib/i386-linux-gnu/libsystemd.so.0 (0xf4099000)
            liblzma.so.5 => /lib/i386-linux-gnu/liblzma.so.5 (0xf406d000)
            liblz4.so.1 => /usr/lib/i386-linux-gnu/liblz4.so.1 (0xf404d000)
            libgcrypt.so.20 => /lib/i386-linux-gnu/libgcrypt.so.20 (0xf3f6a000)
            libgpg-error.so.0 => /lib/i386-linux-gnu/libgpg-error.so.0 (0xf3f45000)
    

    Now, I appreciate that glueing two pieces of relatively complex code together can be a bit involved, but: 69 libraries!? Among them Xrandr, Xinerama, wayland (which I don't use), systemd (even if I used it: what would that plugin have to do with it?), gpg-error, selinux, and then some things I've never heard about.

    I'm sorry, but this is wrong. Which makes me think hard about whether gentoo's USE flags might not really be the way to go in this day and age of exploding dependencies.

    Holy cow.

    In case you came here from a search engine that hit on one of the error messages: After writing this, I was hungry and let it sit. The one thing I can tell you is that the elusive jpeg_resync_to_restart is in libjpeg, indeed. What I don't know yet is why that library hasn't made it into the heap of libraries that the plugin links to.

    I'd suspect that zathura and mupdf are built against different libjpegs – but then I'd have to explain how that would have happened. Hm.

    Addendum (2021-02-13)

    Well, I couldn't let it sit, so here's what I needed to do (and I suspect I'd have found it in one of the upstream branches):

    1. The libjpeg thing really is that the libjpeg library needs to be linked into the plugin on Debian; the details I can't quite figure out, because I'd say the linker should be able to sort this out by itself, but clearly it's not, and the situation is actually being cared for in the plugin's meson file. However, you need to manually flip a switch: this would be a ./configure option in autoconf, but here, the command line is:

      meson setup --wipe -D link-external=true  build
      
    2. However, one link-time dependency is missing with the mupdf from Debian bullseye, and that's the spooky mujs. To fix this, patch the build file like so:

      diff --git a/meson.build b/meson.build
      index 23cdc6a..24929ca 100644
      --- a/meson.build
      +++ b/meson.build
      @@ -19,8 +19,8 @@ zathura = dependency('zathura', version: '>=0.3.9')
       girara = dependency('girara-gtk3')
       glib = dependency('glib-2.0')
       cairo = dependency('cairo' …
