veilor-org 7d2b94b5be feat(hardening): add memory-pressure tuning for zram-only stack
veilor-os runs zram-only swap (THREAT-MODEL.md — no key leak from
disk swap). With kernel defaults that policy bites: once zram fills
there is no overflow tier, the kernel waits until total exhaustion
to trigger OOM, then picks a victim by oom_score and frequently
kills plasmashell or the foreground terminal instead of the leaking
browser tab. Mouse locks for minutes during the thrash window.

Three co-dependent layers:

1. systemd-oomd enabled — PSI-based pre-OOM killer fires at cgroup
   boundaries before the kernel reaper. Fedora's systemd-oomd-defaults
   ship sane thresholds for user.slice; installed in kickstart and
   layered in bluebuild containerfile, enabled in both unit-toggle
   blocks.

2. zram bumped 8 GiB lzo-rle (Fedora default) -> 16 GiB zstd. zstd
   gives ~3:1 (~48 GiB effective) at negligible CPU cost on any
   post-2018 x86_64. 8 GiB filled in practice on 32+ GiB laptops
   running Chromium + LSP + chat clients.

3. /etc/sysctl.d/95-memory-pressure.conf:
   - vm.swappiness=180 (zram is RAM-fast, swap early; default 60
     assumes HDD)
   - vm.watermark_scale_factor=125 (kswapd reclaim starts ~1.25%
     headroom vs default 0.1%; ~400 MiB head start on 32 GiB)
   - vm.page-cluster=0 (no read-ahead; pointless on RAM-backed swap,
     wastes decompress)

Without any one of the three the system still wedges briefly: oomd
without zram tuning waits for PSI to climb; zram tuning without oomd
gets victim selection wrong.
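
The sizing figures quoted in layers 2 and 3 can be sanity-checked with integer arithmetic. This is a rough sketch, assuming the watermark gap is applied to total RAM (the kernel actually computes it per zone, so real values differ slightly). The function names are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for checking the numbers in this commit message.

# watermark_headroom_mib RAM_GIB FACTOR
# vm.watermark_scale_factor is expressed in units of 1/10000 of memory,
# so 125 means 1.25% and the default of 10 means 0.1%.
watermark_headroom_mib() {
    echo $(( $1 * 1024 * $2 / 10000 ))
}

# zram_effective_gib ZRAM_GIB RATIO
# Effective capacity if data compresses at roughly RATIO:1 (an assumption).
zram_effective_gib() {
    echo $(( $1 * $2 ))
}

watermark_headroom_mib 32 125   # tuned value on a 32 GiB machine
watermark_headroom_mib 32 10    # kernel default on the same machine
zram_effective_gib 16 3         # 16 GiB zram at ~3:1
```

On a 32 GiB machine the tuned value comes out at about 409 MiB against about 32 MiB for the default, which matches the "~400 MiB head start" above.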

Verified by new test/boot-checklist.md "Memory pressure" section.
Inline rationale headers in both overlay files so the why survives
doc drift. Trigger event: onyx (Fedora 43, not veilor-os) thrashed
2026-05-11; same defaults shipped to veilor-os, fixed here too.
2026-05-12 10:17:00 +01:00
| Name | Last commit | Date |
| --- | --- | --- |
| test-runs | docs: test run report skeleton for v0.5.32 (Forgejo build) | 2026-05-06 16:10:03 +01:00 |
| auto-install-keymap.sh | v0.5.5: autonomous install test harness (#12) | 2026-05-02 22:49:51 +01:00 |
| auto-install.sh | test/auto-install.sh: auto-fetch + reassemble chunked ISO from ci-latest | 2026-05-02 22:50:37 +01:00 |
| boot-checklist.md | feat(hardening): add memory-pressure tuning for zram-only stack | 2026-05-12 10:17:00 +01:00 |
| METHOD-CHANGELOG.md | docs: METHOD-CHANGELOG 2026-05-06 forgejo entry | 2026-05-06 16:10:03 +01:00 |
| README.md | v0.5.5: autonomous install test harness (#12) | 2026-05-02 22:49:51 +01:00 |
| run-vm.sh | v0.5.32: ship 7 blockers from 9-agent wave | 2026-05-06 16:10:03 +01:00 |
| TESTING.md | v0.5.27: rd.luks.uuid via grubby, GRUB rebrand, fbcon=nodefer, ASCII gum cursor | 2026-05-05 01:43:00 +01:00 |

test/

Test harnesses for veilor-os ISO builds.

Files

| File | Purpose |
| --- | --- |
| run-vm.sh | Manual smoke test: boot the latest ISO interactively in QEMU/KVM. SSH key injection via cloud-init seed, with a monitor sendkey fallback for live-image login. |
| auto-install.sh | Autonomous end-to-end install test. Boots the ISO, drives the gum installer via QEMU monitor sendkey, waits for anaconda to finish and reboot, SSHs into the installed system, and runs the validation checklist. Prints a PASS/FAIL summary. |
| auto-install-keymap.sh | Sourced helper. Provides km_send_str, km_send_chord, km_send_key, km_screendump, km_wait_socket, etc. Reusable by other automation. |
| boot-checklist.md | Manual post-install checklist (run on a real spare laptop). |
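
To illustrate what driving an installer through QEMU monitor sendkey involves, here is a minimal sketch that turns a string into one `sendkey` command per character. It is not the real km_send_str: the function name `keys_for_string` and the small character set it supports are assumptions made for this example. The key names (`spc`, `minus`, `dot`, `shift-a`) are standard QEMU monitor key names.

```shell
#!/bin/sh
# Hypothetical illustration only; the real helper lives in auto-install-keymap.sh.
LC_ALL=C; export LC_ALL   # make [a-z] and [A-Z] match ASCII ranges only

# keys_for_string STRING: print one QEMU monitor "sendkey" line per character.
keys_for_string() {
    s=$1
    while [ -n "$s" ]; do
        rest=${s#?}        # everything after the first character
        c=${s%"$rest"}     # the first character
        s=$rest
        case $c in
            [a-z]|[0-9]) key=$c ;;
            [A-Z]) key="shift-$(printf '%s' "$c" | tr 'A-Z' 'a-z')" ;;
            ' ') key=spc ;;
            -) key=minus ;;
            .) key=dot ;;
            *) echo "unsupported character: $c" >&2; return 1 ;;
        esac
        printf 'sendkey %s\n' "$key"
    done
}
```

The output could then be piped to the monitor socket, for example `keys_for_string testpass1234 | socat - UNIX-CONNECT:/tmp/monitor.sock`, where the socket path is a placeholder rather than the path the harness uses. A real driver would also pace the keystrokes and finish with `sendkey ret`.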

Running the autonomous installer test

./test/auto-install.sh build/out/veilor-os-*.iso

Hardcoded inputs (deterministic — do not edit during a test run):

  • Disk: /dev/vda (the only disk in the QEMU VM)
  • Hostname: veilor (installer hardcoded since v0.5.4)
  • LUKS passphrase: testpass1234
  • Admin password: adminpass1234
  • Locale: en_GB.UTF-8

Expected runtime: 20-30 minutes wall clock (anaconda dominates).

Outputs

  • /tmp/veilor-auto-install.log — full driver log
  • /tmp/veilor-auto-install-NN-<step>.png — milestone screenshots
  • /tmp/veilor-auto-install-final-ssh.txt — final SSH session capture (uname/lsblk/cmdline/failed units)

Exit codes

  • 0 — all validation checks passed
  • 1 — any failure (anaconda crashed, SSH never came up, validation check failed)
  • 2 — preflight failure (missing tool, bad ISO arg, missing OVMF)
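
A caller such as a CI job can branch on these codes. The sketch below maps each documented code to a one-line summary. `describe_exit` is a hypothetical name, not part of the harness.

```shell
#!/bin/sh
# describe_exit CODE: print a one-line summary for an auto-install.sh exit code.
# Hypothetical wrapper; the meanings are taken from the list above.
describe_exit() {
    case $1 in
        0) echo "PASS: all validation checks passed" ;;
        1) echo "FAIL: install, SSH, or validation failure" ;;
        2) echo "PREFLIGHT: missing tool, bad ISO argument, or missing OVMF" ;;
        *) echo "UNKNOWN: unexpected exit code $1" ;;
    esac
}

# Usage (commented out so that sourcing this file does not start a VM):
# ./test/auto-install.sh build/out/veilor-os-*.iso
# describe_exit $?
```

Treating code 2 separately lets a pipeline distinguish a broken test host from a broken ISO.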

Prerequisites

  • qemu-system-x86_64, qemu-img, socat, ssh, ssh-keygen
  • edk2-ovmf (OVMF UEFI firmware at /usr/share/edk2/ovmf/OVMF_{CODE,VARS}.fd)
  • mkisofs or xorriso (for cloud-init seed ISO; harness falls back to TTY1 driving if seed cannot be built or cloud-init does not run on the installed system)
  • convert from ImageMagick (optional — converts PPM screendumps to PNG; harness keeps PPM if absent)
  • KVM access (/dev/kvm readable by the user)
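
The prerequisites above can be checked before a run. This is a sketch of one way to do it, not the harness's own preflight code. The function names are hypothetical, and only the OVMF_CODE.fd path is checked here.

```shell
#!/bin/sh
# missing_tools TOOL...: print each tool that is not on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# preflight: hypothetical check; returns 2 to mirror the documented
# preflight-failure exit code.
preflight() {
    missing=$(missing_tools qemu-system-x86_64 qemu-img socat ssh ssh-keygen)
    if [ -n "$missing" ]; then
        echo "missing tools:" $missing >&2
        return 2
    fi
    [ -r /dev/kvm ] || { echo "/dev/kvm is not readable" >&2; return 2; }
    [ -f /usr/share/edk2/ovmf/OVMF_CODE.fd ] || { echo "OVMF firmware not found" >&2; return 2; }
    # Seed-ISO tooling is optional: the harness falls back to TTY1 driving.
    command -v mkisofs >/dev/null 2>&1 || command -v xorriso >/dev/null 2>&1 \
        || echo "note: neither mkisofs nor xorriso found; no cloud-init seed ISO" >&2
    return 0
}
```

Calling `preflight || exit $?` at the top of a wrapper script would stop early with code 2 on an unprepared host.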

What it validates

Post-install on the booted system:

  • /etc/os-release → NAME=veilor-os
  • hostnamectl --static → veilor
  • systemctl is-active → active for sshd fail2ban usbguard tuned auditd firewalld chronyd sddm
  • getenforce → Enforcing (preferred) or Permissive (acceptable for v0.5.x)
  • lsblk -f shows crypto_LUKS + btrfs
  • /etc/crypttab has a LUKS entry
  • getent passwd admin returns the user
  • /usr/local/bin/{veilor-power,veilor-doctor,veilor-update} are present and executable
  • /proc/cmdline contains init_on_alloc=1
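
Two of these checks can be sketched as small shell functions that take the file to inspect as an argument, so they can be exercised against fixture files as well as the live system. The function names are hypothetical and this is not the harness's actual validation code.

```shell
#!/bin/sh
# os_release_name FILE: print the NAME= value from an os-release style file,
# with any surrounding double quotes removed.
os_release_name() {
    sed -n 's/^NAME=//p' "$1" | tr -d '"' | head -n 1
}

# cmdline_has FILE FLAG: succeed if FLAG appears as a whole
# space-separated word in FILE (for example /proc/cmdline).
cmdline_has() {
    tr ' ' '\n' < "$1" | grep -qxF -- "$2"
}

# check LABEL COMMAND...: run COMMAND and print a PASS/FAIL line for LABEL.
check() {
    label=$1; shift
    if "$@"; then echo "PASS $label"; else echo "FAIL $label"; return 1; fi
}

# Usage on the installed system (commented out; paths are the real ones):
# check "os-release NAME" [ "$(os_release_name /etc/os-release)" = "veilor-os" ]
# check "init_on_alloc"   cmdline_has /proc/cmdline init_on_alloc=1
```

Matching the kernel flag as a whole word avoids a false pass on a similarly prefixed parameter.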

Troubleshooting

  • Stuck at boot banner: ISO didn't autostart veilor-installer on tty1. Check serial.log and auto-install-vm-NN-*.png screenshots. The harness aborts after 5 minutes of identical screen frames.
  • SSH never up: cloud-init may not have run on the installed system (no cidata mount). The harness falls back to TTY1 driving — typing the LUKS passphrase, logging in as admin, and hand-injecting the SSH key. If both paths fail, validation cannot proceed.
  • screendump produces unreadable PPM: install ImageMagick (dnf install ImageMagick) so the harness converts to PNG.
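
The identical-frame abort described in the first item can be approximated by comparing checksums of successive screendumps. The sketch below shows the idea only. The function names and the 10-second polling interval are assumptions, and the harness's real implementation may differ.

```shell
#!/bin/sh
# frame_digest FILE: print "checksum:size" for a screendump file.
frame_digest() {
    cksum < "$1" | awk '{ print $1 ":" $2 }'
}

# stall_elapsed PREV CUR ELAPSED INTERVAL: print the new stall time in
# seconds. It grows by INTERVAL while frames are identical and resets to 0
# as soon as the screen changes.
stall_elapsed() {
    if [ "$1" = "$2" ]; then
        echo $(( $3 + $4 ))
    else
        echo 0
    fi
}

# Polling loop outline (commented out; take_screendump is a placeholder):
# prev=""; elapsed=0
# while [ "$elapsed" -lt 300 ]; do        # 300 s = the 5-minute limit
#     take_screendump /tmp/frame.ppm
#     cur=$(frame_digest /tmp/frame.ppm)
#     elapsed=$(stall_elapsed "$prev" "$cur" "$elapsed" 10)
#     prev=$cur; sleep 10
# done
```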