How to Disable pdf.js in Webkit on Debian

A window of the zathura PDF viewer showing the GDPR.

This is how I want my PDFs rendered. And I want pressing j to scroll down a bit. That pdf.js fails on both counts is just the first two of its defects.

When I upgraded to Debian bookworm, I noticed with great dismay that the webkit browser engine it comes with has a pdf.js-based PDF renderer built in.

That means that my preferred browser, luakit, is basically broken when dealing with PDFs: where I disable Javascript (i.e., by default), I see nothing at all. Where I allow Javascript, my PDFs appear in a UI I consider rather nasty. On top of that, I lose the nice archive of PDFs I've recently read that came with luakit's viewpdf extension. That holds true even if I do manage to properly open the PDF in my preferred renderer (zathura) using pdf.js's Save, as that blindly calls all PDFs “document.pdf”.

Regrettably, there doesn't seem to be a runtime switch to turn off the in-browser PDF rendering. After poking around a bit in webkit's source code, I have convinced myself that I won't add that switch myself. I am just not desperate enough to start hacking on one of the major browser engines.

But there is a build-time switch to turn pdf.js off. I have always shied away from building my own webkit packages because there's so horribly much code and C++ compilers are so terribly resource-hungry. But my suffering with the pdf.js disaster has reached a level that made me overcome that horror. So, here's how to build a Webkit such that browsers based on it will again handle PDFs properly (sc. by handing them over to the system). All this is for Debian bookworm and derivatives; let's hope it won't be necessary beyond that.

  1. Get the source:

    mkdir -p src/webkit
    cd src/webkit
    apt-get source webkit2gtk
    cd webkit2gtk*
    

    This will only work if you have configured a source repo (a deb-src line) for your suite in your /etc/apt/sources.list (or equivalent) and run apt update after that.

    This pulls in about 50 Megabytes, which in itself is an argument in favour of netsurf. But these 50 Megs are peanuts compared to what's coming: by the time you've done a full build, this directory will have exploded into more than 3 GB (on i386). Let's fix the web so plain browsing doesn't require such monsters.

  2. Configure your build. Fortunately, you mostly only touch the debian/rules file. In there, change:

    ENABLE_SOUP2=YES
    ENABLE_SOUP3=YES
    ENABLE_GTK4=YES
    

    to (presumably):

    ENABLE_SOUP2=YES
    ENABLE_SOUP3=NO
    ENABLE_GTK4=NO
    

    That's for luakit, which is built on top of soup2; if your browser uses a different API, make a different choice here. Each build takes forever and gobbles up about 3 Gigs in the process, so be stingy here.

    Then, locate the line -DENABLE_MINIBROWSER=ON (which currently concludes the EXTRA_CMAKE_ARGUMENTS) and change it to:

    -DENABLE_MINIBROWSER=ON \
    -DENABLE_PDFJS=OFF \
    -DENABLE_JOURNALD_LOG=OFF
    

    Disabling the journald log is not strictly necessary, but it helps building on non-systemd boxes, and I doubt it actually hurts anyone.

    Addendum (2024-01-21)

    At least with 2.40.3, this procedure ends in a:

    dh_install: error: missing files, aborting
    

    presumably because we are not building for two APIs. I think that's a bug, but from dh_install's manpage I cannot even understand why it thinks it should fail because of missing files, and consequently futzing around with debian/not-installed or the various options went nowhere. Because I'm really grumpy with the whole state of affairs, I quickly resigned myself to simply emptying all debian/*.install files not pertinent to the packages I want to build.

  3. Remove the systemd build dependency. We can do that because we have just disabled the JOURNALD_LOG. So, in debian/control, delete the line:

    libsystemd-dev [linux-any],
    
  4. Install the build dependencies:

    sudo apt-get build-dep webkit2gtk
    

    On non-systemd boxes, this will say something like:

    libelogind0 : Conflicts: libsystemd0
    

    because you have not removed the libsystemd dependency from apt's database in step (3), and webkit at this point doesn't know it could build with libelogind0-dev, too. Don't worry about it as long as all the other build-dependencies came in.

  5. Make a changelog entry so your system knows your build is “newer” than Debian's and you can later tell it's your custom build:

    dch -i
    

    You probably want to put something like “rebuild with PDFJS disabled” in there, but that's exclusively for your own comfort unless you start distributing your package.

  6. Do the build:

    dpkg-buildpackage -j6 -b -uc -us -rfakeroot
    

    Do that on a cold day, because this will turn your machine into a space heater for several hours (unless you have a very fast machine, in which case you probably don't need another space heater in the first place).

  7. When this is done, you will have about a dozen binary packages in the build directory's parent. You probably don't want to dpkg -i *.deb, as there's no point installing debug packages (for starters). For luakit, I've run this:

    sudo dpkg -i gir1.2-javascriptcoregtk-4.0_2.*.deb gir1.2-webkit2-4.0_2.*.deb libjavascriptcoregtk-4.0-18_2.*.deb libjavascriptcoregtk-4.0-bin_2.*.deb libjavascriptcoregtk-4.0-dev_2.*.deb libwebkit2gtk-4.0-37_2.*.deb libwebkit2gtk-4.0-dev_2.*.deb
    

    This could be a slight over-installation.
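
The edits from steps 2 and 3 can also be scripted, which may be handy if the build has to be repeated for a newer webkit. What follows is a minimal sketch, not a tested recipe: it assumes GNU sed, it assumes that debian/rules contains the ENABLE_ lines and the -DENABLE_MINIBROWSER=ON line exactly as quoted above (with nothing after them on the line), and the function names patch_rules and patch_control are made up for this illustration.

```shell
# Sketch only; assumes GNU sed and the debian/rules layout quoted in step 2.

# Switch off the soup3 and GTK4 builds and append the two extra cmake flags
# after -DENABLE_MINIBROWSER=ON (assumed to end the EXTRA_CMAKE_ARGUMENTS).
patch_rules() {
    rules=$1
    sed -i \
        -e 's/^ENABLE_SOUP3=YES$/ENABLE_SOUP3=NO/' \
        -e 's/^ENABLE_GTK4=YES$/ENABLE_GTK4=NO/' \
        -e 's/-DENABLE_MINIBROWSER=ON$/& \\\n    -DENABLE_PDFJS=OFF \\\n    -DENABLE_JOURNALD_LOG=OFF/' \
        "$rules"
}

# Drop the systemd build dependency from debian/control (step 3).
patch_control() {
    control=$1
    sed -i -e '/^[[:space:]]*libsystemd-dev \[linux-any\],$/d' "$control"
}
```

From the unpacked source directory, that would be patch_rules debian/rules followed by patch_control debian/control. Running either function twice should be harmless, since the patterns no longer match after the first pass. It is probably wise to look at both files afterwards before starting a build that runs for hours.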

By the way, in case the build fails somewhere in the middle but is fundamentally sane, you can resume it by calling:

fakeroot debian/rules binary

Doing dpkg-buildpackage as above resets the build and will discard everything the computer has built in perhaps hours.
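
If the build died in dh_install as described in the addendum to step 2, the surplus install files can be emptied before resuming. The following is a minimal sketch, assuming that each binary package has a file named debian/PACKAGE.install (this has not been checked against every webkit2gtk version). The function name empty_unwanted_install_files is hypothetical, and the packages to keep are whatever you pass in.

```shell
# Sketch only: truncate every *.install file in a directory unless its
# package name is among the names given after the directory.
# Assumes the files are named PACKAGE.install.
empty_unwanted_install_files() {
    dir=$1
    shift
    for f in "$dir"/*.install; do
        [ -e "$f" ] || continue          # glob matched nothing
        pkg=$(basename "$f" .install)
        keep=no
        for wanted in "$@"; do
            if [ "$pkg" = "$wanted" ]; then
                keep=yes
            fi
        done
        if [ "$keep" = no ]; then
            : > "$f"                     # empty the file but leave it in place
        fi
    done
}
```

For the luakit case, one would presumably pass debian as the directory and then the package names from step 7 (libwebkit2gtk-4.0-37, libjavascriptcoregtk-4.0-18 and so on), and afterwards resume with fakeroot debian/rules binary.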

Given the extreme cost of building a webkit, getting pdf.js out in this way is not a long-term plan, at least if you want your webkit to be halfway up-to-date (which is a good idea in particular if you're indiscriminate as to who can execute Javascript in your browser). Until someone kindly implants a run-time switch, I'm going to shut out pdfjs-infested upgrades until some really, really unnerving (that is, even more unnerving than usual) webkit vulnerability surfaces. To do that, I'm dropping:

# my webkit with patched-out pdfjs
Package: libjavascriptcoregtk-4.0-18
Pin: version 2.40.5-1~deb12u1.1
Pin-Priority: 1001

into /etc/apt/preferences.d/10pins (where your version will probably be different; check the version tag in the names of the generated package files). That will make the messages from apt upgrade quite a bit uglier, and of course I'll have a webkit with published security bugs (you have been warned in case you're doing as I do). But in my book that's totally worth it just to get rid of the wretched pdf.js.
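
The version for the pin can also be read off the package file name rather than typed by hand. This is a small sketch, assuming the usual name_version_arch.deb naming of the generated files and a version without an epoch (epochs do not show up in file names); pin_stanza_for is a made-up helper name.

```shell
# Sketch only: print an apt pin stanza for a locally built package file.
# Assumes the file is named PACKAGE_VERSION_ARCH.deb.
pin_stanza_for() {
    file=${1##*/}         # strip any leading directory
    base=${file%.deb}     # strip the extension
    pkg=${base%%_*}       # everything before the first underscore
    rest=${base#*_}       # VERSION_ARCH
    version=${rest%_*}    # drop the trailing _ARCH
    printf '# my webkit with patched-out pdfjs\n'
    printf 'Package: %s\nPin: version %s\nPin-Priority: 1001\n' "$pkg" "$version"
}
```

Something like pin_stanza_for ../libjavascriptcoregtk-4.0-18_*.deb | sudo tee /etc/apt/preferences.d/10pins would then write the file, provided the glob matches exactly one package file.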
