veilor-os runs zram-only swap (THREAT-MODEL.md: no key leak from
disk swap). With kernel defaults that policy bites. Once zram fills
there is no overflow tier: the kernel waits until total exhaustion
to trigger OOM, then picks a victim by oom_score and frequently
kills plasmashell or the foreground terminal instead of the leaking
browser tab. The pointer locks up for minutes during the thrash window.

Three co-dependent layers:

1. systemd-oomd enabled: PSI-based pre-OOM killer fires at cgroup
   boundaries before the kernel reaper. Fedora's systemd-oomd-defaults
   ships sane thresholds for user.slice; installed in kickstart and
   layered in bluebuild containerfile, enabled in both unit-toggle
   blocks.
2. zram bumped 8 GiB lzo-rle (Fedora default) -> 16 GiB zstd. The
   size is uncompressed capacity: at zstd's ~3:1 a full device costs
   ~5.3 GiB of physical RAM, at negligible CPU cost on any post-2018
   x86_64. 8 GiB filled in practice on 32+ GiB laptops running
   Chromium + LSP + chat clients.
3. /etc/sysctl.d/95-memory-pressure.conf:
   - vm.swappiness=180 (zram is RAM-fast, swap early; default 60
     assumes HDD)
   - vm.watermark_scale_factor=125 (kswapd reclaim starts ~1.25%
     headroom vs default 0.1%; ~400 MiB head start on 32 GiB)
   - vm.page-cluster=0 (no read-ahead; pointless on RAM-backed swap,
     wastes decompress)

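
The sysctl figures in layer 3 can be rechecked with a little arithmetic.
This is a back-of-envelope sketch, not part of the change: the helper
names are made up for illustration, and it treats RAM as a single zone,
whereas the kernel applies watermark_scale_factor per node/zone, so real
numbers will differ slightly.

```python
# Back-of-envelope check of the sysctl figures (illustrative helpers only).

def watermark_gap_mib(ram_gib: float, scale_factor: int) -> float:
    """Approximate distance between watermarks, in MiB.

    vm.watermark_scale_factor is expressed in fractions of 10,000,
    so 10 means 0.1% and 125 means 1.25% of memory.
    Assumption: all RAM is treated as one zone.
    """
    return ram_gib * 1024 * scale_factor / 10_000


def swap_in_pages(page_cluster: int) -> int:
    """vm.page-cluster is a log2 value: pages read per swap-in = 2**n."""
    return 2 ** page_cluster


if __name__ == "__main__":
    ram_gib = 32
    for factor in (10, 125):
        gap = watermark_gap_mib(ram_gib, factor)
        print(f"scale_factor={factor}: ~{gap:.1f} MiB between watermarks")
    for cluster in (3, 0):
        print(f"page-cluster={cluster}: {swap_in_pages(cluster)} page(s) per swap-in")
```

On a 32 GiB machine this gives roughly 33 MiB for the default and roughly
410 MiB for 125, which is where the "~400 MiB head start" comes from.
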
Without any one of the three the system still wedges briefly: oomd
without zram tuning waits for PSI to climb; zram tuning without oomd
gets victim selection wrong.
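
For watching PSI climb during a repro, a minimal sketch that parses the
kernel's pressure format. It reads the system-wide /proc/pressure/memory
for simplicity; cgroup v2 exposes the same format per cgroup in
memory.pressure. The function name is illustrative, and this is not how
systemd-oomd itself is implemented.

```python
# Parse PSI output such as:
#   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
#   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
from pathlib import Path


def parse_psi(text: str) -> dict:
    """Return {"some": {"avg10": ..., ...}, "full": {...}} from PSI text."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Each field looks like "avg10=1.23"; split once on "=".
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


if __name__ == "__main__":
    psi_file = Path("/proc/pressure/memory")  # requires a kernel built with PSI
    try:
        psi = parse_psi(psi_file.read_text())
    except OSError:
        print("PSI not available on this system")
    else:
        for kind, values in psi.items():
            print(kind, "avg10 =", values.get("avg10"))
```
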
Verified by new test/boot-checklist.md "Memory pressure" section.
Inline rationale headers in both overlay files so the why survives
doc drift. Trigger event: onyx (Fedora 43, not veilor-os) thrashed
2026-05-11; same defaults shipped to veilor-os, fixed here too.
# veilor-os: memory-pressure tuning for zram-only swap
#
# Rationale: veilor-os ships zram swap with NO disk swap (see THREAT-MODEL.md
# §"Lost or stolen laptop"). The kernel's default vm.* knobs assume a slow
# spinning disk and refuse to swap until physical RAM is nearly exhausted.
# Under a zram-only stack that policy is wrong on two axes:
#
#   1. zram is RAM-fast: there is no penalty for swapping early, only a
#      small CPU cost for zstd compress/decompress.
#   2. Once zram fills, there is no overflow (no disk swap by design), so
#      the kernel falls through to OOM. With default knobs the OOM trigger
#      is slow and reactive: by the time it fires, the system has spent
#      minutes in thrash (compositor/input frozen, mouse stuck) and the
#      kernel picks a victim by oom_score which is often plasmashell or
#      the terminal, i.e. the user's session goes down, not the runaway.
#
# What these knobs do:
#
#   vm.swappiness = 180
#     Tell the kernel to prefer evicting anonymous pages to (zram) swap
#     over reclaiming file-backed pages. Fedora's zram-generator upstream
#     recommends 180 for zram-only systems. Default 60 is tuned for HDD
#     swap and leaves zram unused until too late.
#
#   vm.watermark_scale_factor = 125
#     Start kswapd reclaim earlier (~1.25% of RAM headroom vs default
#     0.1%). On a 32 GiB box that's ~400 MiB head start before allocations
#     would otherwise stall in direct-reclaim. Trades a tiny amount of
#     usable RAM for much smoother latency under bursty allocators
#     (Chromium/Electron tab spawns, language server warm-up).
#
#   vm.page-cluster = 0
#     Read one page per swap-in instead of the default 8. Read-ahead is a
#     win on rotational media because seeks dominate; on zram the seek
#     cost is zero and grabbing 7 extra pages just wastes decompress
#     cycles and CPU cache. Setting to 0 is the documented zram tuning.
#
# Companion: systemd-oomd is enabled in the same change so PSI-based
# pre-OOM kills land on the right cgroup before the kernel OOM reaper
# fires. Without it, even with these knobs the system can still wedge
# briefly while the kernel waits for the global watermark.

vm.swappiness = 180
vm.watermark_scale_factor = 125
vm.page-cluster = 0
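
A quick way to confirm the three values are live after boot is to compare
them against /proc/sys. The sketch below is illustrative only and is not
the contents of test/boot-checklist.md; the function names are
hypothetical, and the parser handles only plain `key = value` lines and
comments, not the full sysctl.d syntax.

```python
# Compare expected sysctl settings against the live values under /proc/sys.
from pathlib import Path

EXPECTED_CONF = """\
vm.swappiness = 180
vm.watermark_scale_factor = 125
vm.page-cluster = 0
"""


def parse_sysctl_conf(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and #/; comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str, root: Path = Path("/proc/sys")) -> Path:
    """Map a sysctl key such as vm.page-cluster to its /proc/sys path."""
    return root / key.replace(".", "/")


def find_mismatches(expected: dict, root: Path = Path("/proc/sys")) -> dict:
    """Return {key: (wanted, found)} for every value that differs or is unreadable."""
    mismatches = {}
    for key, want in expected.items():
        try:
            got = proc_path(key, root).read_text().strip()
        except OSError:
            got = None
        if got != want:
            mismatches[key] = (want, got)
    return mismatches


if __name__ == "__main__":
    bad = find_mismatches(parse_sysctl_conf(EXPECTED_CONF))
    for key, (want, got) in bad.items():
        print(f"{key}: expected {want}, found {got}")
    print("all values match" if not bad else f"{len(bad)} mismatch(es)")
```

The `root` parameter exists so the comparison can be pointed at a scratch
directory instead of the real /proc/sys.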