veilor-os/test/boot-checklist.md
veilor-org 7d2b94b5be feat(hardening): add memory-pressure tuning for zram-only stack
veilor-os runs zram-only swap (THREAT-MODEL.md — no key leak from
disk swap). With kernel defaults that policy bites: once zram fills
there is no overflow tier, the kernel waits until total exhaustion
to trigger OOM, then picks a victim by oom_score and frequently
kills plasmashell or the foreground terminal instead of the leaking
browser tab. Mouse locks for minutes during the thrash window.

Three co-dependent layers:

1. systemd-oomd enabled — PSI-based pre-OOM killer fires at cgroup
   boundaries before the kernel reaper. Fedora's systemd-oomd-defaults
   ship sane thresholds for user.slice; installed in kickstart and
   layered in bluebuild containerfile, enabled in both unit-toggle
   blocks.

2. zram bumped 8 GiB lzo-rle (Fedora default) -> 16 GiB zstd. zstd
   gives ~3:1 (~48 GiB effective) at negligible CPU cost on any
   post-2018 x86_64. 8 GiB filled in practice on 32+ GiB laptops
   running Chromium + LSP + chat clients.

3. /etc/sysctl.d/95-memory-pressure.conf:
   - vm.swappiness=180 (zram is RAM-fast, swap early; default 60
     assumes HDD)
   - vm.watermark_scale_factor=125 (kswapd reclaim starts ~1.25%
     headroom vs default 0.1%; ~400 MiB head start on 32 GiB)
   - vm.page-cluster=0 (no read-ahead; pointless on RAM-backed swap,
     wastes decompress)

Without any one of the three the system still wedges briefly: oomd
without zram tuning waits for PSI to climb; zram tuning without oomd
gets victim selection wrong.

Verified by new test/boot-checklist.md "Memory pressure" section.
Inline rationale headers in both overlay files so the why survives
doc drift. Trigger event: onyx (Fedora 43, not veilor-os) thrashed
2026-05-11; same defaults shipped to veilor-os, fixed here too.
2026-05-12 10:17:00 +01:00


# Spare-laptop validation checklist
Run after installing a fresh veilor-os ISO. Each item should pass
before the build is considered green.
## Install flow
- [ ] Anaconda **only** prompts for LUKS passphrase — no account wizard,
no initial-setup screen
- [ ] Install completes without `%post` errors (check `/var/log/veilor-install.log`)
- [ ] Reboot succeeds, USB removed cleanly
## First boot
- [ ] LUKS prompt appears at boot
- [ ] TTY1 shows veilor-os banner + password prompt
- [ ] Password rejection on weak input (try `password123` — should fail)
- [ ] Password set succeeds with strong input
- [ ] SDDM starts after password set
- [ ] `admin@veilor-os` shell prompt visible after first login
- [ ] `veilor-firstboot.service` shows `inactive (dead)` and `disabled`
after first run
## Identity
- [ ] `passwd -S root` reports `L` (locked)
- [ ] `getent passwd | wc -l` equals the base system account count plus one (`admin`)
- [ ] `id admin` shows `groups=...,wheel`
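
The identity checks above can be made less dependent on counting lines by eye. A minimal sketch, assuming Fedora's usual convention that regular accounts have UIDs from 1000 up to (but excluding) 65534; `human_users` and `is_locked` are hypothetical helpers, not part of veilor-os.

```shell
# Hypothetical helper: print login names of regular (non-system) accounts
# from passwd-format input on stdin.
# Assumption: regular users have 1000 <= UID < 65534 (compare UID_MIN in
# /etc/login.defs on the installed system).
human_users() {
    awk -F: '$3 >= 1000 && $3 < 65534 { print $1 }'
}

# Hypothetical helper: succeed only if field 2 of `passwd -S` output on
# stdin is a locked marker. "L" is what the checklist expects; "LK" is
# accepted too because some passwd implementations print that form.
is_locked() {
    local status
    status=$(awk '{ print $2; exit }')
    [ "$status" = "L" ] || [ "$status" = "LK" ]
}

# Usage on the installed system:
#   getent passwd | human_users        # should list only: admin
#   passwd -S root | is_locked && echo "root is locked"
```
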
## Branding
- [ ] `hostnamectl` reports `veilor-os`
- [ ] `cat /etc/os-release` shows `NAME="veilor-os"` and `ID=veilor`
- [ ] `grep -ri onyx /etc /usr/local /usr/share/fonts` returns no matches
- [ ] `grep -ri '192\.168\.0\.\|admin@gmail\|fedora\.local' /etc /usr/local` returns no matches
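
For the two `grep` checks, "no matches" corresponds to grep exit status 1, not 0. A small sketch of a wrapper that makes this explicit; `no_match` is a hypothetical helper, and treating grep errors as "inconclusive" is a design assumption rather than project policy.

```shell
# Hypothetical helper: succeed only when PATTERN matches nothing under the
# given paths. grep exits 0 on a match, 1 on no match, and 2 on errors
# (for example an unreadable file), so the three cases are told apart.
no_match() {
    local pattern="$1" rc=0
    shift
    grep -rqi -- "$pattern" "$@" 2>/dev/null || rc=$?
    case "$rc" in
        0) return 1 ;;   # leftover string found
        1) return 0 ;;   # clean
        *) return 2 ;;   # grep itself failed; result is inconclusive
    esac
}

# Usage (from a root shell, so unreadable files do not cause status 2):
#   no_match onyx /etc /usr/local /usr/share/fonts && echo "branding clean"
```
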
## Theme
- [ ] KDE color scheme shows `veilor-black` in System Settings
- [ ] Konsole renders in DuckSans (`fc-match sans-serif` returns
`DuckSans` if the font was vendored)
- [ ] Background is pure black (#000000), not Breeze dark grey
## Power
- [ ] `veilor-power status` runs without sudo, shows current profile
- [ ] `veilor-power save` switches to `veilor-powersave`
- [ ] `veilor-power perf` switches to `veilor-performance`
- [ ] Unplugging AC auto-switches to `veilor-powersave` (udev rule)
- [ ] Plugging AC auto-switches to `veilor-performance`
## Hardening — services
- [ ] `systemctl is-active fail2ban` → active
- [ ] `systemctl is-active usbguard` → active
- [ ] `systemctl is-active auditd` → active
- [ ] `systemctl is-active firewalld` → active
- [ ] `systemctl is-active tuned` → active
- [ ] `systemctl is-active chronyd` → active
- [ ] `systemctl is-active sshd` → active
- [ ] `systemctl is-active cups` → inactive / not-found
- [ ] `systemctl is-active avahi-daemon` → inactive / not-found
- [ ] `systemctl is-active bluetooth` → inactive
- [ ] `systemctl is-active veilor-modules-lock` (after 30s) → active
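
The service checks above could be run in one pass. A minimal sketch, assuming `systemctl is-active` prints a single state word per unit; `state_ok` and `check_units` are hypothetical helpers, not shipped in veilor-os, and the accepted-state list for absent units is an assumption.

```shell
# Hypothetical helper: succeed if ACTUAL is one of the "|"-separated
# states in ACCEPTED.
state_ok() {
    local actual="$1" accepted="$2" s
    local IFS='|'
    for s in $accepted; do
        [ "$actual" = "$s" ] && return 0
    done
    return 1
}

# Read "unit accepted-states" pairs from stdin and print PASS/FAIL lines.
# The optional first argument replaces the query command (default:
# `systemctl is-active`), which keeps the function testable off-target.
check_units() {
    local query="${1:-systemctl is-active}" unit accepted actual rc=0
    while read -r unit accepted; do
        [ -n "$unit" ] || continue
        actual=$($query "$unit" 2>/dev/null </dev/null) || true
        if state_ok "$actual" "$accepted"; then
            echo "PASS $unit ($actual)"
        else
            echo "FAIL $unit (got '${actual:-none}', want $accepted)"
            rc=1
        fi
    done
    return "$rc"
}

# Usage on the installed system (wait 30s for veilor-modules-lock):
#   check_units <<'EOF'
#   fail2ban active
#   usbguard active
#   cups inactive|not-found
#   bluetooth inactive
#   veilor-modules-lock active
#   EOF
```
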
## Hardening — kernel/sysctl
- [ ] `getenforce` → `Enforcing`
- [ ] `mokutil --sb-state` → `SecureBoot enabled`
- [ ] `sysctl kernel.yama.ptrace_scope` → `2`
- [ ] `sysctl kernel.kptr_restrict` → `2`
- [ ] `sysctl fs.suid_dumpable` → `0`
- [ ] `sysctl dev.tty.ldisc_autoload` → `0`
- [ ] `sysctl kernel.modules_disabled` (after 30s post graphical) → `1`
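
The sysctl items above can also be read straight from `/proc/sys`, where each `.` in a key becomes a `/` in the path. A minimal sketch; `sysctl_file` and `check_sysctls` are hypothetical helpers, and the overridable root directory exists only so the logic can be exercised away from the target machine.

```shell
# Hypothetical helper: map a sysctl key to its file under a procfs root,
# e.g. kernel.yama.ptrace_scope -> /proc/sys/kernel/yama/ptrace_scope
sysctl_file() {
    printf '%s/%s\n' "${2:-/proc/sys}" "$(printf '%s' "$1" | tr . /)"
}

# Read "key expected" pairs from stdin and compare with the live value.
# The optional argument overrides the root directory.
check_sysctls() {
    local root="${1:-/proc/sys}" key want got rc=0
    while read -r key want; do
        [ -n "$key" ] || continue
        got=$(cat "$(sysctl_file "$key" "$root")" 2>/dev/null) || got="unreadable"
        if [ "$got" = "$want" ]; then
            echo "PASS $key = $got"
        else
            echo "FAIL $key = $got (want $want)"
            rc=1
        fi
    done
    return "$rc"
}

# Usage on the installed system (30s after the graphical target):
#   check_sysctls <<'EOF'
#   kernel.yama.ptrace_scope 2
#   kernel.kptr_restrict 2
#   fs.suid_dumpable 0
#   dev.tty.ldisc_autoload 0
#   kernel.modules_disabled 1
#   EOF
```
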
## Hardening — network
- [ ] `firewall-cmd --get-default-zone` → `drop`
- [ ] `firewall-cmd --zone=drop --list-services` → `ssh`
- [ ] `resolvectl status` shows DNSSEC + DoT, LLMNR off
- [ ] `chronyc sources -v` shows NTS-authenticated peers
## Hardening — SSH
- [ ] `sshd -T | grep -E 'permitrootlogin|passwordauth|allowusers|x11forwarding'`
shows: `permitrootlogin no`, `passwordauthentication no`,
`allowusers admin`, `x11forwarding no`
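
The four directives can be checked as whole lines rather than by reading the `grep` output. A minimal sketch, assuming `sshd -T` prints one lowercase `keyword value` pair per line; `check_sshd` is a hypothetical helper.

```shell
# Hypothetical helper: read `sshd -T` output on stdin and report each
# directive the checklist expects. Matching is on complete lines, so
# "allowusers admin bob" does not pass as "allowusers admin".
check_sshd() {
    local cfg want rc=0
    local nl='
'
    cfg=$(cat)
    for want in 'permitrootlogin no' 'passwordauthentication no' \
                'allowusers admin' 'x11forwarding no'; do
        case "$nl$cfg$nl" in
            *"$nl$want$nl"*) echo "PASS $want" ;;
            *) echo "FAIL $want"; rc=1 ;;
        esac
    done
    return "$rc"
}

# Usage on the installed system (sshd -T typically needs root):
#   sudo sshd -T | check_sshd
```
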
## Disk
- [ ] `lsblk -f` shows LUKS2 on the main partition
- [ ] `cryptsetup luksDump /dev/...` shows argon2id, aes-xts-plain64
- [ ] `swapon` shows `zram` device, no disk swap
- [ ] `zramctl` shows `ALGORITHM=zstd` and `DISKSIZE=16G` (= 16 GiB,
not Fedora's 8 GiB default — see `overlay/etc/systemd/zram-generator.conf`)
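
The two swap checks can be expressed as filters over `swapon` and `zramctl` output. A minimal sketch; `only_zram_swap` and `zram_matches` are hypothetical helpers, and the size is compared as the literal string `zramctl` prints (`16G`), not as a byte count.

```shell
# Hypothetical helper: read swap device names (one per line) on stdin and
# succeed only if there is at least one and every one is a zram device.
only_zram_swap() {
    local dev seen=0
    while read -r dev; do
        [ -n "$dev" ] || continue
        case "$dev" in
            /dev/zram[0-9]*) seen=1 ;;
            *) echo "non-zram swap: $dev"; return 1 ;;
        esac
    done
    [ "$seen" -eq 1 ]
}

# Hypothetical helper: read "ALGORITHM DISKSIZE" lines on stdin and compare
# every device with the expected algorithm ($1) and size string ($2).
# Fails when there is no zram device at all.
zram_matches() {
    local algo size rc=1
    while read -r algo size; do
        [ -n "$algo" ] || continue
        rc=0
        [ "$algo" = "$1" ] && [ "$size" = "$2" ] || return 1
    done
    return "$rc"
}

# Usage on the installed system:
#   swapon --show=NAME --noheadings | only_zram_swap
#   zramctl --noheadings --output ALGORITHM,DISKSIZE | zram_matches zstd 16G
```
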
## Memory pressure
- [ ] `systemctl is-active systemd-oomd` → `active` (PSI-based pre-OOM
killer; without it the kernel waits until total RAM exhaustion
then often kills plasmashell or the active terminal instead of
the runaway tab)
- [ ] `sysctl vm.swappiness vm.watermark_scale_factor vm.page-cluster`
shows `180 / 125 / 0` (the defaults `60 / 10 / 3` assume disk-backed
swap; on zram-only they delay swapping and reclaim until memory is
nearly exhausted, then the system thrashes)
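
The figures quoted for these settings in the commit message (~400 MiB of reclaim head start on 32 GiB, ~48 GiB effective zram) can be reproduced with integer arithmetic. A small sketch; the function names are hypothetical, and the 3:1 zstd ratio is the commit message's estimate, not a measurement.

```shell
# Distance between kswapd watermarks in MiB for a given amount of RAM.
# vm.watermark_scale_factor is expressed in units of 1/10000 of memory,
# so 125 means 1.25% and the default 10 means 0.1%.
watermark_gap_mib() {   # args: ram_gib scale_factor
    echo $(( $1 * 1024 * $2 / 10000 ))
}

# Effective zram capacity in GiB at an assumed compression ratio.
zram_effective_gib() {  # args: disksize_gib ratio
    echo $(( $1 * $2 ))
}

watermark_gap_mib 32 125   # tuned value on 32 GiB: prints 409
watermark_gap_mib 32 10    # kernel default on 32 GiB: prints 32
zram_effective_gib 16 3    # 16 GiB at an assumed 3:1: prints 48
```
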
## SELinux module
- [ ] `semodule -l | grep veilor-systemd` → present
- [ ] No SELinux denials in `ausearch -m AVC -ts boot` related to
`systemd_modules_load_t`
## USBGuard
- [ ] `systemctl status usbguard` → active
- [ ] `wc -l /etc/usbguard/rules.conf` → 0 (empty allowlist by design)
- [ ] After `sudo sh -c 'usbguard generate-policy > /etc/usbguard/rules.conf'`
(the redirect has to run as root, not in the calling shell) and a
`usbguard` restart, all currently-connected USB devices remain
functional
## Findings
Log issues and fixes here:
| Date | Item | Issue | Fix in kickstart? |
|------|------|-------|-------------------|
| | | | |