veilor-os/docs/HARDENING.md
veilor-org 7d2b94b5be feat(hardening): add memory-pressure tuning for zram-only stack
veilor-os runs zram-only swap (THREAT-MODEL.md — no key leak from
disk swap). With kernel defaults that policy bites: once zram fills
there is no overflow tier, the kernel waits until total exhaustion
to trigger OOM, then picks a victim by oom_score and frequently
kills plasmashell or the foreground terminal instead of the leaking
browser tab. Mouse locks for minutes during the thrash window.

Three co-dependent layers:

1. systemd-oomd enabled — PSI-based pre-OOM killer fires at cgroup
   boundaries before the kernel reaper. Fedora's systemd-oomd-defaults
   ship sane thresholds for user.slice; installed in kickstart and
   layered in bluebuild containerfile, enabled in both unit-toggle
   blocks.

2. zram bumped 8 GiB lzo-rle (Fedora default) -> 16 GiB zstd. The
   zram size is the uncompressed capacity, so at zstd's ~3:1 a full
   16 GiB device costs ~5.3 GiB of RAM, at negligible CPU cost on any
   post-2018 x86_64. 8 GiB filled in practice on 32+ GiB laptops
   running Chromium + LSP + chat clients.

3. /etc/sysctl.d/95-memory-pressure.conf:
   - vm.swappiness=180 (zram is RAM-fast, swap early; default 60
     assumes HDD)
   - vm.watermark_scale_factor=125 (kswapd reclaim starts ~1.25%
     headroom vs default 0.1%; ~400 MiB head start on 32 GiB)
   - vm.page-cluster=0 (no read-ahead; pointless on RAM-backed swap,
     wastes decompress)

Without any one of the three the system still wedges briefly: oomd
without zram tuning waits for PSI to climb; zram tuning without oomd
gets victim selection wrong.

Verified by new test/boot-checklist.md "Memory pressure" section.
Inline rationale headers in both overlay files so the why survives
doc drift. Trigger event: onyx (Fedora 43, not veilor-os) thrashed
2026-05-11; same defaults shipped to veilor-os, fixed here too.
2026-05-12 10:17:00 +01:00

Hardening Reference

What veilor-os locks down and why. Each item is applied by either the kickstart %post or the overlay tree shipped in /etc.

Boot chain

| Item | State | Source |
|---|---|---|
| Secure Boot | Required (bootloader signed) | bootloader kickstart line |
| Kernel lockdown | lockdown=integrity | bootloader kernel args |
| Slab hardening | slab_nomerge, init_on_alloc=1, init_on_free=1 | bootloader |
| Stack offset | randomize_kstack_offset=on | bootloader |
| vsyscall | vsyscall=none | bootloader |
| LUKS2 | aes-xts-plain64 / argon2id, mem=1GB, time=9 | part pv.veilor |
| Module loading | Locked 30s after graphical boot | veilor-modules-lock.service |
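
One way to confirm that the bootloader arguments in the table actually reached the running kernel is to scan /proc/cmdline. The following is a minimal sketch, not something veilor-os ships: the function name check_cmdline and its optional file argument are made up for illustration.

```shell
# Hypothetical helper: report which hardening arguments from the Boot chain
# table are missing from a kernel command line.
# $1 = file to read (defaults to /proc/cmdline); the override exists only so
# the function can be tried against a sample file.
check_cmdline() {
    cmdline_file=${1:-/proc/cmdline}
    # Pad with spaces so that only whole arguments match.
    cmdline=" $(tr '\n' ' ' < "$cmdline_file") "
    missing=0
    for want in lockdown=integrity slab_nomerge init_on_alloc=1 \
                init_on_free=1 randomize_kstack_offset=on vsyscall=none; do
        case "$cmdline" in
            *" $want "*) echo "ok       $want" ;;
            *)           echo "MISSING  $want"; missing=1 ;;
        esac
    done
    return "$missing"
}
```

Calling check_cmdline with no argument inspects the live system; a non-zero exit status means at least one argument is absent.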

Kernel sysctl

/etc/sysctl.d/99-veilor-hardening.conf:

| Key | Value | Why |
|---|---|---|
| kernel.kptr_restrict | 2 | hide kernel pointers from /proc |
| kernel.dmesg_restrict | 1 | dmesg root-only |
| kernel.yama.ptrace_scope | 2 | admin-only attach: ptrace requires CAP_SYS_PTRACE |
| kernel.perf_event_paranoid | 3 | unprivileged perf disabled |
| net.core.bpf_jit_harden | 2 | BPF JIT constant blinding |
| kernel.randomize_va_space | 2 | full ASLR |
| fs.suid_dumpable | 0 | no SUID core dumps |
| dev.tty.ldisc_autoload | 0 | block tty LPE vector |
| net.ipv4.tcp_syncookies | 1 | SYN flood mitigation |
| net.ipv4.conf.all.rp_filter | 1 | reverse-path filter |
| accept_source_route | 0 (v4+v6) | ignore source routing |
| accept_redirects | 0 (v4+v6) | ignore ICMP redirects |
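
To check that the running kernel still matches the file, the entries can be compared with the live values under /proc/sys. This is a rough sketch under stated assumptions: check_sysctl_conf is a hypothetical helper, it handles only plain "key = value" lines with scalar values, and it needs fully spelled-out key names (the abbreviated accept_source_route and accept_redirects rows above would have to appear in the file under their complete per-protocol keys).

```shell
# Hypothetical helper: compare "key = value" lines in a sysctl.d file with
# the live values under /proc/sys.
# $1 = conf file, $2 = procfs root (defaults to /proc/sys; overridable so the
# function can be tried against a scratch directory).
check_sysctl_conf() {
    conf=$1
    proc_root=${2:-/proc/sys}
    drift=0
    # Comments and blank lines are stripped by sed below; each remaining
    # entry is split on the first "=".
    while IFS='=' read -r key want; do
        key=$(printf '%s' "$key" | tr -d ' \t')
        want=$(printf '%s' "$want" | tr -d ' \t')
        [ -n "$key" ] || continue
        # sysctl keys map to paths by turning dots into slashes.
        path="$proc_root/$(printf '%s' "$key" | tr . /)"
        have=$(cat "$path" 2>/dev/null | tr -d ' \t')
        if [ "$have" = "$want" ]; then
            echo "ok     $key = $have"
        else
            echo "DRIFT  $key: want $want, have ${have:-<unreadable>}"
            drift=1
        fi
    done <<EOF
$(sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$conf")
EOF
    return "$drift"
}
```

A typical call would be check_sysctl_conf /etc/sysctl.d/99-veilor-hardening.conf; the exit status is non-zero if any value has drifted.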

SELinux

  • Enforcing, targeted policy.
  • Custom module veilor-systemd grants systemd_modules_load_t the sys_admin and perfmon capabilities required by the modules-lock service. Source: scripts/selinux/veilor-systemd.te.

veilor-firstboot SELinux confinement

The first-boot password service is privileged (it has to write /etc/shadow) but small. Module veilor-firstboot carves a tight domain:

  • Allowed: read /etc/passwd, exec passwd(1), write /var/lib/veilor-firstboot.done, write /etc/sddm.conf.d/, start sddm.service.
  • neverallow rules block: network sockets (no phone-home), home_root_t / user_home_t access, sys_module, sys_ptrace, sys_rawio.

Source: scripts/selinux/veilor-firstboot.te. Build & load with scripts/selinux/build-policy.sh (loads all modules in one pass).

Network surface

  • firewalld default zone = drop.
  • Inbound: ssh only.
  • systemd-resolved: LLMNR off, DNSSEC allow-downgrade, DNS-over-TLS opportunistic. Resolvers: Cloudflare (1.1.1.1, 1.0.0.1), fallback Quad9 (9.9.9.9, 149.112.112.112).
  • chrony: NTS-authenticated time from time.cloudflare.com, nts.sth1.ntp.se and nts.sth2.ntp.se. Pool servers are used only as a fallback.

SSH

/etc/ssh/sshd_config.d/10-veilor-hardening.conf:

  • PasswordAuthentication no
  • PermitRootLogin no
  • AllowUsers admin
  • X11Forwarding no
  • MaxAuthTries 3
  • ClientAliveInterval 300
  • LogLevel VERBOSE
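
Because drop-in files can be overridden by other fragments, it can be worth checking the effective configuration rather than the file itself. A minimal sketch, assuming the output format of sshd -T (one lowercase keyword per line followed by its value); check_sshd_effective is a hypothetical name, not a veilor-os tool.

```shell
# Hypothetical helper: read `sshd -T` output on stdin and confirm that the
# settings listed above are in effect. Both sides are lowercased before
# comparing, so the case of values such as VERBOSE does not matter.
check_sshd_effective() {
    effective=$(tr '[:upper:]' '[:lower:]')
    bad=0
    for want in "passwordauthentication no" "permitrootlogin no" \
                "allowusers admin" "x11forwarding no" "maxauthtries 3" \
                "clientaliveinterval 300" "loglevel verbose"; do
        # -x: whole-line match, -F: fixed string, -q: quiet.
        if printf '%s\n' "$effective" | grep -qxF "$want"; then
            echo "ok    $want"
        else
            echo "FAIL  $want"
            bad=1
        fi
    done
    return "$bad"
}

# Typical use (root is needed to read the host keys):
#   sudo sshd -T | check_sshd_effective
```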

Auth / accounts

  • Root account locked (passwd -l root). No interactive root login.
  • Single admin user, wheel group, full sudo.
  • pwquality.conf: minlen=14, 4 character classes required, dictcheck.
  • First-boot password flow: chage -d 0 admin expires the empty password immediately. veilor-firstboot.service runs on TTY1 before SDDM, prompts for new password, then starts the graphical session.

Audit

/etc/audit/rules.d/99-veilor-hardening.rules watches:

  • /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow
  • /etc/sudoers, /etc/sudoers.d/
  • /etc/ssh/sshd_config*, /etc/selinux/, /etc/firewalld/
  • /etc/cron.*, /var/spool/cron/
  • /etc/sysctl.*, /etc/systemd/system/, /usr/lib/systemd/system/
  • All privileged binaries (sudo, su, passwd, mount, pkexec, etc.)
  • Kernel module load/unload syscalls
  • Permission/ownership changes by uid≥1000
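
For readers unfamiliar with the rule syntax, the path watches above take the form -w PATH -p PERMS -k KEY, while the module-load item is a syscall rule. The sketch below generates a few such lines; it is illustrative only and does not reproduce the shipped rules file. The helper gen_watch_rules and the key names (identity, sudoers, modules) are assumptions.

```shell
# Hypothetical generator: emit auditd watch rules for the given paths.
# $1 = key used to tag matching events (the key names here are made up),
# remaining arguments = paths to watch for writes and attribute changes.
gen_watch_rules() {
    key=$1
    shift
    for path in "$@"; do
        # The format starts with %s so printf never mistakes "-w" for an option.
        printf '%s %s -p wa -k %s\n' -w "$path" "$key"
    done
}

gen_watch_rules identity /etc/passwd /etc/shadow /etc/group /etc/gshadow
gen_watch_rules sudoers  /etc/sudoers /etc/sudoers.d/
# Module load/unload is a syscall rule rather than a path watch:
echo '-a always,exit -F arch=b64 -S init_module,finit_module,delete_module -k modules'
```

The first line printed is -w /etc/passwd -p wa -k identity. Events tagged this way can later be filtered by key with ausearch -k.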

Intrusion detection

fail2ban jails:

  • sshd — aggressive mode, 3 retries, 24h ban
  • pam-generic — 5 retries, 1h ban (catches XDM, su, sudo failures)

Backend: systemd journal. Action: firewalld rich rules.

USB

USBGuard daemon, ImplicitPolicyTarget=block.

Ships with empty allowlist. On first boot, admin runs:

sudo sh -c 'usbguard generate-policy > /etc/usbguard/rules.conf'
sudo systemctl restart usbguard

This snapshots all currently-connected devices into the allowlist. Anything plugged in afterward is blocked unless explicitly allowed:

sudo usbguard list-devices
sudo usbguard allow-device <id>

Disabled services

abrt*, cups, cups-browsed, geoclue, avahi-daemon, bluetooth, ModemManager, gssproxy, atd, pcscd.socket, pcscd.service, kdeconnectd (removed at package level).

AppArmor (v0.5)

Fedora 43 ships AppArmor alongside SELinux. veilor-os keeps SELinux as the primary MAC layer (enforcing, targeted) but ships AppArmor profile skeletons for high-risk userland binaries that benefit from a second, binary-scoped policy on top of SELinux's type-enforcement one.

Profiles live in scripts/apparmor/:

| Profile | Target | Default mode |
|---|---|---|
| usr.bin.thorium | Thorium browser | complain |
| usr.local.bin.lm-studio | LM Studio LLM runner | complain |
| usr.bin.veilor-power | Power profile switcher | enforce |

Profiles are not loaded automatically — they are opt-in until v0.5. Enable a profile post-install with:

sudo dnf install apparmor-utils apparmor-parser
sudo install -m 0644 scripts/apparmor/usr.bin.thorium /etc/apparmor.d/
sudo apparmor_parser -r /etc/apparmor.d/usr.bin.thorium
sudo aa-complain /etc/apparmor.d/usr.bin.thorium      # either: log violations only
sudo aa-enforce  /etc/apparmor.d/usr.bin.thorium      # or: block violations

Refine complain-mode profiles with aa-logprof after exercising the app through normal use; it converts logged denials into rule additions interactively.

Audit log shipping (optional)

Local journald is the default audit sink. For off-device shipping to a trusted log collector (Loki / Wazuh / Splunk), veilor-os ships a disabled-by-default plugin template:

  • /etc/audit/plugins.d/veilor-remote.conf — auditd plugin shim (set active = yes to enable).
  • /etc/audisp/audisp-remote.conf.disabled — audisp-remote target config template (rename to audisp-remote.conf and edit remote_server to enable).

Warning: enabling remote audit shipping leaks every privileged syscall, file-watch hit, and auth event off-device. Treat the collector as a host with the same trust level as root. Only enable if the collector itself is hardened and the transport is TLS or kerberized.

Reference integration paths in the template: Loki via promtail/vector syslog source, Wazuh via local wazuh-agent (no network shipping needed), Splunk via HEC bridge.

What's not enabled by default

  • Disk swap: replaced by zram (RAM-only, no key leak risk).
  • Bluetooth: disabled. Enable with systemctl enable --now bluetooth.
  • Printing: CUPS removed. Reinstall if needed: dnf install cups.
  • Snapd, Flatpak: not installed (Flatpak optional add-on).
  • PackageKit: removed; updates manual via dnf.

Memory pressure

veilor-os runs zram-only swap (see THREAT-MODEL.md; this keeps cleartext session keys out of any persistent allocation that would survive suspend-to-disk or a yanked drive). That stance has a sharp edge: once zram fills, there is no overflow tier. With stock kernel defaults the result is a multi-minute thrash (compositor frozen, mouse stuck, keyboard ignored) followed by a kernel OOM kill that picks the wrong victim (often plasmashell or the foreground terminal) because the runaway browser tab has a lower oom_score than the long-lived session process. The user's desktop dies; the leaking app survives.

Three layers of mitigation ship by default:

| Layer | File | What it does | Failure mode if absent |
|---|---|---|---|
| systemd-oomd | enabled in kickstart/veilor-os.ks %post and in bluebuild/recipe.yml unit-toggle RUN | PSI-based pre-OOM killer: picks the cgroup under highest memory+IO pressure and terminates it before the kernel's global reaper fires. Reads from /proc/pressure/*, kills at the cgroup boundary so siblings survive. | Kernel waits until total exhaustion. Picks by oom_score → plasmashell / terminal die, browser tab keeps leaking. Mouse locks during the wait. |
| zram-generator override | overlay/etc/systemd/zram-generator.conf (and matching %post write) | 16 GiB zram device compressed with zstd. The size is the uncompressed capacity, so at ~3:1 a full device occupies roughly 5.3 GiB of physical RAM. Replaces Fedora default 8 GiB / lzo-rle. | 8 GiB fills under sustained pressure on 32+ GiB laptops running Chromium + LSP + chat. No overflow (no disk swap) → straight to OOM. |
| vm.* sysctls | overlay/etc/sysctl.d/95-memory-pressure.conf | swappiness=180 (use zram early, since it is RAM-fast), watermark_scale_factor=125 (kswapd starts reclaim at ~1.25 % headroom vs default 0.1 %), page-cluster=0 (no read-ahead, which is pointless on RAM-backed swap and wastes decompress cycles). | Defaults 60 / 10 / 3 assume slow HDD swap. Kernel refuses to swap until allocations stall in direct-reclaim → thrash window before either oomd or kernel OOM acts. |

All three are co-dependent: oomd without zram tuning still wedges briefly waiting for PSI to climb; zram tuning without oomd still gets kernel-OOM victim selection wrong. Verified by test/boot-checklist.md "Memory pressure" section.

Layer rationale is recorded in the headers of overlay/etc/sysctl.d/95-memory-pressure.conf and overlay/etc/systemd/zram-generator.conf, kept inline so the why survives even if this doc is deleted.
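
The watermark figures in the Memory pressure section can be sanity-checked with a little arithmetic, and the PSI signal that systemd-oomd reacts to can be read directly. The sketch below is illustrative only: watermark_headroom_mib and psi_avg10 are hypothetical helpers, and the headroom number is an approximation, since the kernel computes watermarks per zone.

```shell
# Hypothetical helpers for sanity-checking the memory-pressure numbers.

# Approximate kswapd headroom in MiB for a given RAM size (MiB) and
# vm.watermark_scale_factor (unit: 1/10000 of memory), rounded down.
watermark_headroom_mib() {
    echo $(( $1 * $2 / 10000 ))
}

# 10-second stall average from a PSI file.
# $1 = some|full, $2 = PSI file (defaults to /proc/pressure/memory).
psi_avg10() {
    awk -v kind="$1" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }' \
        "${2:-/proc/pressure/memory}"
}

watermark_headroom_mib 32768 125   # tuned value on a 32 GiB machine: prints 409
watermark_headroom_mib 32768 10    # kernel default: prints 32
```

On a running system, psi_avg10 some shows the share of the last ten seconds in which at least one task stalled on memory; watching it climb during a stress test is a quick way to see how much warning oomd gets before the kernel reaper would act.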