Tag Luakit

  • How to Disable pdf.js in Webkit on Debian

    A window of the zathura PDF viewer showing the GDPR.

    This is how I want my PDFs rendered. And I want a j to scroll down a bit. That pdf.js fails on both counts is just the first two of its defects.

    When I upgraded to Debian bookworm, I noticed with great dismay that the webkit browser engine it comes with has a pdf.js-based PDF renderer built in.

    That means that my preferred browser, luakit, is basically broken when dealing with PDFs: where I disable Javascript (which is my default), I see nothing at all. Where I allow Javascript, my PDFs appear in a UI I consider rather nasty. On top of that, I lose the nice archive of PDFs I've recently read that came with luakit's viewpdf extension. That holds true even if I do manage to properly open the PDF in my preferred renderer (zathura) using pdf.js's Save, as that blindly calls all PDFs “document.pdf”.

    Regrettably, there doesn't seem to be a runtime switch to turn off the in-browser PDF rendering. After poking around a bit in webkit's source code, I have convinced myself that I won't add that switch myself. I am just not desperate enough to start hacking on one of the major browser engines.

    But there is a build-time switch to turn pdf.js off. I have always shied away from building my own webkit packages because there's so horribly much code and C++ compilers are so terribly resource-hungry. But my suffering with the pdf.js disaster has reached a level that made me overcome that horror. So, here's how to build a Webkit such that browsers based on it will again handle PDFs properly (sc. by handing them over to the system). All this is for Debian bookworm and derivatives; let's hope it won't be necessary beyond that.

    1. Get the source:

      mkdir -p src/webkit
      cd src/webkit
      apt-get source webkit2gtk
      cd webkit2gtk*

      This will only work if you have configured a source repo for your suite in your /etc/apt/sources.list (or equivalent) and run apt update after that.
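
      For bookworm, such a deb-src line might look like this (the mirror here is just an example; use whichever you normally pull from):

      ```
      # in /etc/apt/sources.list (or a snippet in /etc/apt/sources.list.d/)
      deb-src http://deb.debian.org/debian bookworm main
      ```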

      This pulls in about 50 megabytes, which in itself is an argument in favour of netsurf. But these 50 megs are peanuts compared to what's coming: by the time you've done a full build, this directory will have exploded into more than 3 GB (on i386). Let's fix the web so plain browsing doesn't require such monsters.

    2. Configure your build. Fortunately, you mostly only touch the debian/rules file. In there, change:


      to (presumably):


      That's for luakit that is built on top of soup2; if your browser uses a different API, make a different choice here. Each build takes forever and gobbles up about 3 Gigs in the process, so be stingy here.

      Then, locate the line -DENABLE_MINIBROWSER=ON (which currently concludes the EXTRA_CMAKE_ARGUMENTS) and change it to:

      -DENABLE_MINIBROWSER=ON \
      -DENABLE_PDFJS=OFF \
      -DENABLE_JOURNALD_LOG=OFF

      Disabling the journald log is not strictly necessary, but it helps building on non-systemd boxes, and I doubt it actually hurts anyone.

      Nachtrag (2024-01-21)

      At least with 2.40.3, this procedure ends in a:

      dh_install: error: missing files, aborting

      presumably because we are not building for both APIs. I think that's a bug, but from dh_install's manpage I cannot even understand why it thinks it should fail because of missing files, and consequently futzing around with debian/not-installed or the various options went nowhere. Because I'm really grumpy with the whole state of affairs, I quickly resigned myself to simply emptying all debian/*.install files not pertinent to the packages I want to build.
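
      If you want to script that emptying, here is a sketch; it runs on a scratch directory standing in for debian/, and the patterns of packages to keep are just an illustration matching the ones installed in step 7:

      ```shell
      demo=/tmp/webkit-install-demo         # stand-in for the package's debian/ dir
      mkdir -p "$demo" && cd "$demo"
      echo "usr/lib/libwebkit2gtk-4.0.so.37" > libwebkit2gtk-4.0-37.install
      echo "usr/bin/WebKitWebDriver" > webkit2gtk-driver.install

      for f in *.install; do
        case "$f" in
          # pertinent to the packages we actually build and install: keep
          libwebkit2gtk-4.0-*.install|libjavascriptcoregtk-4.0-*.install|gir1.2-*.install)
            ;;
          # everything else: empty the file so dh_install has nothing to miss
          *)
            : > "$f"
            ;;
        esac
      done
      ```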

    3. Remove the systemd build dependency. We can do that because we have just disabled the JOURNALD_LOG. So, in debian/control, delete the line:

      libsystemd-dev [linux-any],
    4. Install the build dependencies:

      sudo apt-get build-dep webkit2gtk

      On non-systemd boxes, this will say something like:

      libelogind0 : Conflicts: libsystemd0

      because you have not removed the libsystemd dependency from apt's database in step (3), and webkit at this point doesn't know it could build with libelogind0-dev, too. Don't worry about it as long as all the other build-dependencies came in.

    5. Make a changelog entry so your system knows your build is “newer” than Debian's and you can later tell it's your custom build:

      dch -i

      You probably want to put something like “rebuild with PDFJS disabled” in there, but that's exclusively for your own comfort unless you start distributing your package.

    6. Do the build:

      dpkg-buildpackage -j6 -b -uc -us -rfakeroot

      Do that on a cold day, because this will turn your machine into a space heater for several hours (unless you have a very fast machine, in which case you probably don't need another space heater in the first place).

    7. When this is done, you will have about a dozen binary packages in the build directory's parent. You probably don't want to dpkg -i *.deb, as there's no point installing debug packages (for starters). For luakit, I've run this:

      sudo dpkg -i gir1.2-javascriptcoregtk-4.0_2.*.deb gir1.2-webkit2-4.0_2.*.deb libjavascriptcoregtk-4.0-18_2.*.deb libjavascriptcoregtk-4.0-bin_2.*.deb libjavascriptcoregtk-4.0-dev_2.*.deb libwebkit2gtk-4.0-37_2.*.deb libwebkit2gtk-4.0-dev_2.*.deb

      This could be a slight over-installation.

    By the way, in case the build fails somewhere in the middle but is fundamentally sane, you can resume it by calling:

    fakeroot debian/rules binary

    Doing dpkg-buildpackage as above resets the build and will discard everything the computer has built in perhaps hours.

    Given the extreme cost of building a webkit, getting pdf.js out in this way is not a long-term plan, at least if you want your webkit to be halfway up-to-date (which is a good idea in particular if you're indiscriminate as to who can execute Javascript in your browser). Until someone kindly implants a run-time switch, I'm going to shut out pdfjs-infested upgrades until some really, really unnerving (that is, even more unnerving than usual) webkit vulnerability surfaces. To do that, I'm dropping:

    # my webkit with patched-out pdfjs
    Package: libjavascriptcoregtk-4.0-18
    Pin: version 2.40.5-1~deb12u1.1
    Pin-Priority: 1001

    into /etc/apt/preferences.d/10pins (where your version will probably be different; check the version tag in the names of the generated package files). That will make the messages from apt upgrade quite a bit uglier, and of course I'll have a webkit with published security bugs (you have been warned in case you're doing as I do). But in my book that's totally worth it just to get rid of the wretched pdf.js.

  • Browsing Peace and Privacy With dnsmasq

    Screenshot of the dnsmasq extra configuration page in freetz

    You can even have the DNS-based adblocking discussed here in your whole network if your router runs dnsmasq (it probably does) and you can edit its configuration (you probably can't). As shown here, with freetz you can.

    I'm not a big fan of in-browser adblocking. For one, I have my doubts about several of the extensions – Adblock Plus, for instance, comes from a for-profit company, though I grant you this critique might be partisan. Also, I like to switch browsers freely and certainly don't want to maintain block lists for each of them; and finally, quite a few clients other than browsers may render HTML and hence ads.

    At least with the pages I want (and don't want) to read, there's a much lighter alternative: DNS-based adblocking. You see, on the relatively few commercial pages I occasionally have reason to visit, ads, tracking pixels, and nasty javascript typically are served from a rather small set of domains – doubleclick.net, googleadservices.com, and a few more like these. If I can make my computer resolve these names to – that is, my computer in IPv4, or yours, if you type that address –, everything your browser would pull from these servers is instantly gone in everything rendering HTML.

    So, how do you do that? Well, you first make sure that your computer does the name resolution itself[1]. On Debian, you do that by installing the packages resolvconf (without a second e; in a systemd environment I think you want to use systemd-resolved instead) and dnsmasq; that's really all, and that ought to work out of the box in all reasonably common situations:

    $ sudo apt install resolvconf dnsmasq

    You will probably have to bring your network down and up again for this to take effect.

    Once that's done, you can tell dnsmasq what names to resolve to what. The man page dnsmasq(8) documents what to do under the --address option – you could actually configure dnsmasq through command line options exclusively –, where you can read:

    -A, --address=/<domain>[/<domain>...]/[<ipaddr>]

    Specify an IP address to return for any host in the given domains. […] A common use of this is to redirect the entire doubleclick.net domain to some friendly local web server to avoid banner ads. The domain specification works in the same was [sic, as of bullseye] as for --server […]

    – and from the documentation of --server you learn that <domain> is interpreted as a suffix (if you will), such that if you give an address for, say, google.com, it will also be used for foo.google.com or foo.bar.google.com.
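
    So a single line per unwanted zone suffices. In dnsmasq.conf syntax, a made-up example (ads.example.com stands in for a real ad server):

    ```
    # everything at or below ads.example.com resolves to
    address=/ads.example.com/
    ```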

    But where do these address expressions go? Well, at least in Debian, dnsmasq will read (essentially, see the README in there) any file you drop into /etc/dnsmasq.d and add its content to its configuration. Having configuration snippets in different files really helps maintenance and dist-upgrades in general; in this case, it also helps distributing the blacklist, as extra configuration that may be inappropriate on a different host is kept in some other file.

    I tend to prefix snippet names with numbers in case order might one day matter. So, I have a file /etc/dnsmasq.d/10spamreduce.conf containing:

    address=/doubleclick.net/
    address=/googleadservices.com/
    address=/gstatic.com/
    address=/facebook.com/
    When you do the same thing, you should restart dnsmasq and then see the effect like this:

    $ sudo service dnsmasq restart
    $ dig +short fonts.gstatic.com

    As you can see, I have also included some trackers and other sources of annoyance in my address list. Of course, if you actually want to read Facebook (ugh) or need to pull Google's fonts (ughugh), you'll have to adapt that list a bit.

    In case you have interesting and useful contributions to this list: Please do write in!

    [1]Regrettably, with things like DNS over HTTPS, it could be that your browser actually will not use your computer's DNS resolver. Adblocking hence is one extra reason to disable DoH when you see it.
  • What to do when github eats 100% CPU in luakit

    I can't help it: like probably just about every other programming life form on this planet, I have to be on github now and then. Curse the network effect and all those taking part in it (which by now includes me).

    Anyway, that's why the last iteration of luakit bug #972 (also on github. Sigh) bit me badly: as long as the browser is on a github page, it will spend a full 100% of a CPU on producing as many error messages as it can, each reading:

    https://github.githubassets.com/<alphabet soup>1:8116:
    CONSOLE JS ERROR Unhandled Promise Rejection:
    TypeError: undefined is not an object (evaluating 'navigator.clipboard.read')

    Github being a commercial entity, I figured it's a waste of time trying to file a bug report. And the problem didn't fix itself, either.

    So, I went and fixed it (in a fashion) with a userscript. Since the problem apparently is that some github code doesn't properly catch a missing (or blacklisted) clipboard API in the browser (and I still consider blacklisting that API an excellent idea), I figured things should improve if I gave github something similar enough to an actual clipboard. It turns out it does not need to be terribly similar at all. So, with a few lines of Javascript, while github still sucks, at least it doesn't eat my CPU any more.

    What do you need to do? Just create a userscript like this (for luakit; other browsers will have other ways):

    mkdir -p .local/share/luakit/scripts
    cat > .local/share/luakit/scripts/github.user.js

    Then paste the following piece of Javascript into the terminal:

    // ==UserScript==
    // @name          clipboard-for-github
    // @namespace     http://blog.tfiu.de
    // @description   Fix github's 100% CPU usage due to unhandled clipboard errors
    // @include       https://github.com*
    // ==/UserScript==
    navigator.clipboard = Object()
    navigator.clipboard.read = function() {
            return "";
    }

    As usual with this kind of thing, at least have a quick glance at what this code does; the few lines of source code that suffice here are at least easy to review. Finish off with a control-D, go to a luakit window, and say :uscripts-reload.

    If you then go to, say, bug #972, your CPU load should stay down. Of course, as long as github blindly tries to use the navigator.clipboard object for “copy link”-type operations, those still won't work. But that's now github's problem, not mine.

    And anyway: Give up Github.

  • View with Netsurf

    A screenshot of a browser window

    An early version of this post rendered in netsurf.

    I believe just about the worst threat to software freedom these days are web browsers. That is not only because, for many people out there, they are already a more relevant application platform than their primary operating system, and because almost everything that gets run on that platform is extremely non-Free software. I've been linking to a discussion of this problem from these pages since this blog's day one as part of my quip on “best viewed with javascript disabled”.

    No, they are also a threat because the “major” browser engines are so humongous that they in effect lock out most platforms (which simply don't have enough power to run them). And they are a threat because the sheer size and complexity of their code bases make it essentially impossible for an individual to fix almost any relevant bug in them related to rendering, javascript execution, or network interactions.

    That is why I am so grateful to the authors and maintainers of both dillo (Debian: dillo) and netsurf (Debian: netsurf-gtk, mainly), small browsers with maintainable code bases. While dillo is really basic and is missing so much of CSS and modern HTML that on today's web even many non-adversarial sites become barely usable, netsurf is usually just fine for websites respecting user rights.

    Flex layouts and the article elements: The good part of 20 years of web development after the Web 1.0.

    I have to admit I nevertheless only use it in very specific contexts, mostly because luakit, with its vi-like key bindings and lua extensibility, in the end usually wins out even though I don't trust the webkit rendering engine for two cents[1]. And that is why I hadn't noticed that this blog rendered a lot worse than it should have in netsurf. This is particularly shameful because the breakage was mostly due to liberties I had taken with web standards that I should not have taken. Apologies: Netsurf was right and I was wrong.

    I have improved that quite a bit this morning. Given I am using flex layouts quite liberally here, and these don't work in Debian stable's netsurf, the rendered pages do look quite a bit different in netsurf than in the “major” browsers. But the fallbacks are ok as far as I am concerned. Since flex layouts are among the few “innovations” in the post-Web 1.0 ecosystem that are actually a good idea, I gladly accept these fallbacks. Let me stress again that it is a feature of a friendly web, rather than a bug, that pages look different in different user agents.

    Dillo, regrettably, is another matter because of the stupid^Wunderconsidered colour games I'm playing here. As things are right now, the light background below text like this one sits on an HTML5 article element, which dillo ignores. Hence, the text is black on dark green, which, well, may be barely readable but really is deeply sub-optimal. Since I consider the article element and its brethren real progress in terms of markup (the other positive “innovation” post Web-1.0), I will not change that markup just to make this render better in dillo. I may finally re-think the silly dark green background soon-ish, though.

    [1]If you feel like this, too, let's team up and massage luakit's front end to work with netsurf's rendering engine. Given the close entanglement of luakit with the webkitgtk API, this certainly will result in a very different program, and almost certainly there would be no way to re-use luakit extensions. Still, I could very well see such a thing become my main browser.
  • Javascript Local Storage

    There is a large number of creepy APIs (that is: “stuff a program can do”) in modern Javascript, from the sensor APIs (for querying the device's acceleration, orientation, and ambient light) to Websocket (which lets more or less arbitrary servers talk to the client continuously, even “in the background” in Web Workers) – creepy because, in current, unmodified browsers, every web page can use this stuff, although admittedly a few particularly dramatic APIs (microphone and camera, for instance) are still guarded by confirmation dialogs. Oh: web pages do not merely get to use all that – far too many require Javascript and simply do not work when you turn it off.

    This broad access to almost everything from Javascript is why I often react rather grumpily to “please enable Javascript” banners (and even more to their absence on pages that nevertheless depend on Javascript). At least since XMLHTTPRequest (which let Javascript talk to the originating server behind the user's back; it arrived in the first half of the noughties) and CORS (which extended XMLHTTPRequest, and now fetch, to arbitrary servers as long as they cooperate; it showed up in Firefox in 2009), allowing Javascript means granting rather far-reaching access to your own machine – and that was true even before Spectre and friends, which, by the way, would be largely irrelevant on private machines without ubiquitous Javascript.

    One API that is quite creepy because it gives web pages an infinitely long memory in your browser is Local Storage; with it, web pages can store serious amounts of data in your browser and find it again the next time they are allowed to run Javascript. That the data is stored locally at first does not help much – via Websocket, or failing that a fetch with a payload, the stuff ends up (back) on any server willing to listen. Mind you: without users noticing anything.

    If you want to share my creeps, you can enable Javascript here for a moment (you don't normally let me run Javascript, do you?). You will then get an input box below this paragraph into which you can type stuff. If your browser is sufficiently modern (technically: it has to send the storage event; Firefox 78 does, for instance, Webkit 4.0.37 does not), you can see your message in the Javascript warning at the foot of the page – shockingly, even in other windows you have open on blog.tfiu.de. On all halfway current major browsers, the text will in any case appear after the next reload. And it stays there. Go on, write your future self a message into the footer of this page:

    Good. You have Javascript turned off.

    Call me paranoid, but I don't find this funny.

    And that is why pages that stay empty without Local Storage, or break in some other inscrutable way, annoy me even more than pages that demand Javascript.

    What to do?

    For the major browsers there are all kinds of extensions that let you manage Local Storage; in Firefox I currently see Forget Me Not and StoragErazor (which also covers the IndexedDB API, which is similarly horrid but not much used yet); as far as I can tell, the indispensable noscript does not offer running Javascript with Local Storage disabled.

    For my main browser, luakit (damn, yet another link from here into github), I have written a small extension that lets me quickly toggle local storage with the key combination ,tq; that way I can normally browse without Local Storage, and when a page really is broken (gitlab currently is a particularly bad example), it only takes a few keystrokes. An indicator in the status bar (q/Q for storage off/on) is included, too:

    -- Web storage control
    -- @module webstorage_control
    -- @copyright Public Domain

    local _M = {}

    local window = require("window")
    local theme = require("theme")
    local settings = require("settings")
    local modes = require("modes")
    local add_binds = modes.add_binds

    function update_webstorage_disp(w)
        if settings.get_setting("webview.enable_html5_database") then
            w.sbar.r.webstorage_d.text = "Q"
        else
            w.sbar.r.webstorage_d.text = "q"
        end
    end

    function toggle_local_storage(w)
        local local_storage_enabled =
            settings.get_setting("webview.enable_html5_database")
        -- keep both storage-related settings in sync
        settings.set_setting("webview.enable_html5_database",
            not local_storage_enabled)
        settings.set_setting("webview.enable_html5_local_storage",
            not local_storage_enabled)
        update_webstorage_disp(w)
    end

    window.add_signal("init", function (w)
        local r = w.sbar.r
        r.webstorage_d = widget{type="label"}
        r.layout:pack(r.webstorage_d)
        r.layout:reorder(r.webstorage_d, 1)
        r.webstorage_d.font = theme.font
        update_webstorage_disp(w)
    end)

    add_binds("normal", {
        { "^,tq$", "Enable/disable web local storage", toggle_local_storage},
    })

    return _M

    -- vim: et:sw=4:ts=8:sts=4:tw=80

    If you use luakit as well, you can drop this into .config/luakit/webstorage_control.lua and then put:

    require("webstorage_control")

    in a suitable place (e.g., in .config/luakit/rc.lua). If one day as many pages break without Local Storage as they currently do without Javascript, this would probably have to be automated, more along the lines of the noscript extension.

    Even when I do allow Local Storage for a while, I of course don't want the stuff to stay around. So I have written the following little Python program:

    #!/usr/bin/env python3
    """
    Clear luakit web local storage and cookies.

    There's a whitelist applied to both.
    """

    import fnmatch
    import glob
    import os
    import re
    import sqlite3


    class Whitelist:
        """A fnmatch-based whitelist.

        Test as in "domain in whitelist".  It's not fast, though.
        """
        def __init__(self,
                src_path=os.path.expanduser("~/.config/luakit/cookie.whitelist")):
            with open(src_path) as f:
                self.patterns = [s.strip() for s in f.read().split("\n")]

        def __contains__(self, domain):
            for pattern in self.patterns:
                if fnmatch.fnmatch(domain, pattern):
                    return True
            return False


    def clear_cookies(whitelist):
        """removes cookies from domains not in whitelist from luakit's
        cookies db.
        """
        # path assumed; adapt if your luakit keeps its data elsewhere
        conn = sqlite3.connect(
            os.path.expanduser("~/.local/share/luakit/cookies.db"))
        try:
            all_hosts = list(r[0]
                for r in conn.execute("select distinct host from moz_cookies"))
            for host in all_hosts:
                if host in whitelist:
                    continue
                conn.execute("delete from moz_cookies where host=?",
                    (host,))
            conn.commit()
        finally:
            conn.close()


    def try_unlink(f_name):
        """removes f_name if it exists.
        """
        if os.path.exists(f_name):
            os.unlink(f_name)


    def clear_local_storage(whitelist):
        """removes luakit's local storage files unless their source
        domains are whitelisted for cookies.
        """
        # path assumed, as above
        for f_name in glob.glob(os.path.expanduser(
                "~/.local/share/luakit/local_storage/*.localstorage")):
            mat = re.match(r"https?_(.*?)_\d+\.localstorage",
                os.path.basename(f_name))
            if not mat:
                continue
            if mat.group(1) in whitelist:
                continue
            try_unlink(f_name)


    def main():
        whitelist = Whitelist()
        clear_cookies(whitelist)
        clear_local_storage(whitelist)


    if __name__ == "__main__":
        main()

    The program reads a list of shell patterns (something like *.xkcd.com) from a file ~/.config/luakit/cookie.whitelist and then deletes all cookies and Local Storage entries in luakit that do not come from servers mentioned in that exception list. It runs from my crontab every morning:

    01 09 * * * ~/mybin/clear_luakit_cookies.py
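
    One fnmatch quirk worth knowing when writing that whitelist: a pattern like *.xkcd.com matches subdomains, but not the bare domain, so you may want to list xkcd.com as a separate pattern:

    ```python
    import fnmatch

    # "*.xkcd.com" covers any subdomain of xkcd.com ...
    assert fnmatch.fnmatch("www.xkcd.com", "*.xkcd.com")
    # ... but not the bare domain itself:
    assert not fnmatch.fnmatch("xkcd.com", "*.xkcd.com")
    print("whitelist patterns behave as expected")
    ```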

    But really, it would be better if none of this were there. In a browser – that is, something you want to look at web pages with – an API like Local Storage, with persistence and signalling, has no business existing.

    Oh: the Javascript source code for all this fun with your notes in the footer is frighteningly small. In case I ever rewrite it and it can no longer be found in the page source, here is, first, the code for the input box above:

    <div id="textarea-container">
      <p><em>Gut.  Du hast Javascript aus.</em></p>
    </div>
    <script defer="defer">
      function setUp() {
        document.querySelector("#textarea-container").innerHTML =
          `<textarea id="to-store" style="width:100%; height:6cm"
          placeholder="Tippt kram, den ihr im Fuß der Seite sehen wollt."
          ></textarea>`;
        let textSource = document.querySelector("#to-store");

        if (!window.localStorage) {
          textSource.value = "Ah, gut.  Du hast local storage aus.  Dann\n"
            +"geht das hier nicht.  Wie gesagt, das ist gut.";
          return;
        }

        if (window.localStorage.savedText) {
          textSource.value = window.localStorage.getItem("savedText");
        }

        textSource.addEventListener("input", () => {
          window.localStorage.setItem("savedText", textSource.value);
        });

        window.addEventListener("storage", (ev) => {
          textSource.value = ev.newValue;
        });
      }

      setUp();
    </script>

    And here is the code managing the footer (it currently sits in the head element):

    // `target` is the footer element the surrounding (not shown) code
    // writes its Javascript warning into
    if (window.localStorage) {
      target.innerHTML +=
        `<div id="ls-warning"><p><strong>Schlimmer noch:</strong>
        Dein Browser lässt mich
        auch local storage machen.  Wenn du dir in
        <a href="/javascript-local-storage.html">diesem Artikel</a>
        eine Nachricht hinterlässt, bekommst du sie bei deinem nächsten
        Besuch unten zu sehen; du kannst da sogar dein eigenes
        HTML und Javascript unterbringen und ausführen lassen.</p>
        <p id="user-message"></p></div>`;

      if (window.localStorage.savedText) {
        document.querySelector("#user-message").innerHTML =
          window.localStorage.getItem("savedText");
      }

      window.addEventListener("storage", (ev) => {
        document.querySelector("#user-message").innerHTML = ev.newValue;
      });
    }
