diff --git a/CHANGELOG.md b/CHANGELOG.md
index f610f05..a978c7e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,6 +11,110 @@ future maintainers can see why a change exists, not just what it changes.
 
 ## [Unreleased]
 
+### Hardening: CPU/IO slice isolation for background services
+
+Companion to the memory-pressure tuning (see prior entry). Memory was
+only half the story — once OOM thrash was solved, a second class of
+"why is my expensive laptop typing like a Chromebook" symptom emerged:
+post-boot CPU/IO contention.
+
+#### Bug found
+
+Live incident on a 24-thread Ryzen AI 9 HX 370 / 30 GiB workstation,
+2026-05-13: ~16 minutes after login, load avg climbed to ~6.5, typing
+in konsole and the address bar lagged by hundreds of ms. RAM and swap
+were uncontended (8 GiB used / 30 GiB total, zero swap), so the
+memory-pressure work was holding. PSI showed `cpu some=0.34` — pure
+scheduler contention.
+
+Root cause: every Fedora unit ships with `CPUWeight=[not set]`
+(defaults to 100), so under contention the kernel's CFS splits CPU
+evenly between every leaf cgroup. With the post-boot storm running
+concurrently:
+
+- `plasma-discover` (KDE update GUI, autostarted via
+  `/etc/xdg/autostart/org.kde.discover.notifier.desktop`) — ~80 % CPU
+  doing repo metadata refresh
+- `packagekitd` (the discover backend) — ~33 %
+- `fwupd` + `fwupd-refresh` — ~20 %
+- `dnf-makecache.timer` firing in the same window
+- `kwin_wayland` (~33 %) and `plasmashell` (~19 %) competing on equal
+  footing with all of the above
+
+The compositor lost scheduling fights against package metadata, hence
+the typing lag. zram-only swap and `vm.swappiness=180` are correct for
+this stack but do nothing for a CPU-bound storm.
+
+#### Fix applied
+
+Two new slices in `overlay/etc/systemd/system/`:
+
+1. **`system-bg.slice`** — `CPUWeight=20`, `IOWeight=50`,
+   `MemoryHigh=4G`. Drop-ins assign `packagekit.service`,
+   `fwupd.service`, `fwupd-refresh.service`, `dnf-makecache.service`,
+   and `dnf5-automatic.service` into it with `Nice=10` and
+   `IOSchedulingClass=idle`.
+2. **`user-.slice.d/10-boost.conf`** — `CPUWeight=300`,
+   `IOWeight=200` on every logged-in user session. Combined with
+   above, gives a **15:1** interactive:background CPU ratio under
+   contention. Idle systems still get full speed; weights are
+   proportional, not hard caps.
+
+Two boot-storm sources defused:
+
+- `overlay/etc/skel/.config/autostart/org.kde.discover.notifier.desktop`
+  shadows the system autostart with `Hidden=true`. Updates still flow
+  via `dnf5-automatic.timer`; users can launch Discover manually. No
+  GUI fires at session start.
+- `dnf-makecache.timer.d/10-delay.conf` pushes `OnBootSec=20min` so
+  metadata refresh lands past peak session bring-up.
+
+One opt-in artifact for users:
+
+- `overlay/etc/skel/.config/systemd/user/user-bg.slice`
+  (`CPUWeight=30`, `IOWeight=50`, `MemoryHigh=3G`). Veilor-os does not
+  ship sync tools by default, but anyone installing Syncthing /
+  rclone / a file indexer can drop a `Slice=user-bg.slice` drop-in
+  on the service and inherit the same protection at the user level.
+
+Verified live (post-incident workstation, before opening the PR):
+
+```
+slice             CPUWeight  IOWeight  MemoryHigh
+system-bg.slice   20         50        4G
+user-1000.slice   500        500       infinity
+user-bg.slice     30         50        3G
+```
+
+cgroup placement confirmed via `systemd-cgls`: `packagekit.service`
+under `/system.slice/system-bg.slice/`, `syncthing.service` under
+`/user.slice/user-1000.slice/.../user-bg.slice/`. Load dropped from
+6.53 → 3.55 within minutes of applying, and typing in the compositor
+recovered immediately on the next contention event.
+
+#### Follow-up surfaced during this work (not in this PR)
+
+While debugging "still feels laggy after slice fix" on the same
+workstation, found two power-profile bugs worth a separate
+investigation:
+
+1. `tuned-adm active` reported `balanced` despite the system being on
+   AC + charging. EPP was `balance_performance` and all 24 cores sat
+   pinned at `scaling_min_freq` (605 MHz) — typing latency was the
+   CPU refusing to ramp on short bursts, even with no contention.
+   Manually setting EPP to `performance` and switching to the stock
+   `throughput-performance` profile restored snappy input.
+2. `tuned-adm profile onyx-performance` (shipped via
+   `overlay/etc/tuned/profiles/`) **silently fell back to `balanced`**
+   instead of activating. No errors in `journalctl -u tuned`. The
+   profile config or its `tuned.conf` script likely has a bad exit
+   somewhere; needs reproduction in CI and a test that asserts
+   `tuned-adm active` matches what was requested.
+
+Both are tracked for a follow-up branch — out of scope here because
+this PR only covers cgroup/slice isolation. Filing now so it does not
+get lost.
+
 ### v0.7 BlueBuild OCI spike (active — `v0.7-bluebuild-spike`)
 
 CI plumbing landed (~13 fixes) to unblock the first green BlueBuild
diff --git a/overlay/etc/skel/.config/autostart/org.kde.discover.notifier.desktop b/overlay/etc/skel/.config/autostart/org.kde.discover.notifier.desktop
new file mode 100644
index 0000000..6ed21de
--- /dev/null
+++ b/overlay/etc/skel/.config/autostart/org.kde.discover.notifier.desktop
@@ -0,0 +1,11 @@
+[Desktop Entry]
+# Shadow /etc/xdg/autostart/org.kde.discover.notifier.desktop.
+# Auto-launching the Discover updater at session start stacks
+# CPU/IO load with packagekit + dnf-makecache + fwupd-refresh.
+# Users can still launch Discover manually; updates also happen
+# via dnf5-automatic.timer. This only suppresses the autostart.
+Type=Application
+Name=Discover Update Notifier
+Exec=true
+Hidden=true
+X-KDE-autostart-condition=
diff --git a/overlay/etc/skel/.config/systemd/user/user-bg.slice b/overlay/etc/skel/.config/systemd/user/user-bg.slice
new file mode 100644
index 0000000..1c001bc
--- /dev/null
+++ b/overlay/etc/skel/.config/systemd/user/user-bg.slice
@@ -0,0 +1,15 @@
+[Unit]
+Description=User background services (low priority)
+
+# For per-user cloud-sync / indexer / backup tools the user opts into
+# (Syncthing, rclone, file indexers, etc). Drop a service drop-in at
+# ~/.config/systemd/user/.service.d/10-bg.conf with:
+#   [Service]
+#   Slice=user-bg.slice
+#   Nice=10
+#   IOSchedulingClass=idle
+
+[Slice]
+CPUWeight=30
+IOWeight=50
+MemoryHigh=3G
diff --git a/overlay/etc/systemd/system/dnf-makecache.service.d/10-bg.conf b/overlay/etc/systemd/system/dnf-makecache.service.d/10-bg.conf
new file mode 100644
index 0000000..5455d13
--- /dev/null
+++ b/overlay/etc/systemd/system/dnf-makecache.service.d/10-bg.conf
@@ -0,0 +1,4 @@
+[Service]
+Slice=system-bg.slice
+Nice=10
+IOSchedulingClass=idle
diff --git a/overlay/etc/systemd/system/dnf-makecache.timer.d/10-delay.conf b/overlay/etc/systemd/system/dnf-makecache.timer.d/10-delay.conf
new file mode 100644
index 0000000..43310ef
--- /dev/null
+++ b/overlay/etc/systemd/system/dnf-makecache.timer.d/10-delay.conf
@@ -0,0 +1,5 @@
+[Timer]
+# Default OnBootSec fires the makecache job near login, stacking
+# CPU/IO load with the desktop session bring-up. 20min delay puts
+# the refresh past peak session-start activity.
+OnBootSec=20min
diff --git a/overlay/etc/systemd/system/dnf5-automatic.service.d/10-bg.conf b/overlay/etc/systemd/system/dnf5-automatic.service.d/10-bg.conf
new file mode 100644
index 0000000..5455d13
--- /dev/null
+++ b/overlay/etc/systemd/system/dnf5-automatic.service.d/10-bg.conf
@@ -0,0 +1,4 @@
+[Service]
+Slice=system-bg.slice
+Nice=10
+IOSchedulingClass=idle
diff --git a/overlay/etc/systemd/system/fwupd-refresh.service.d/10-bg.conf b/overlay/etc/systemd/system/fwupd-refresh.service.d/10-bg.conf
new file mode 100644
index 0000000..5455d13
--- /dev/null
+++ b/overlay/etc/systemd/system/fwupd-refresh.service.d/10-bg.conf
@@ -0,0 +1,4 @@
+[Service]
+Slice=system-bg.slice
+Nice=10
+IOSchedulingClass=idle
diff --git a/overlay/etc/systemd/system/fwupd.service.d/10-bg.conf b/overlay/etc/systemd/system/fwupd.service.d/10-bg.conf
new file mode 100644
index 0000000..5455d13
--- /dev/null
+++ b/overlay/etc/systemd/system/fwupd.service.d/10-bg.conf
@@ -0,0 +1,4 @@
+[Service]
+Slice=system-bg.slice
+Nice=10
+IOSchedulingClass=idle
diff --git a/overlay/etc/systemd/system/packagekit.service.d/10-bg.conf b/overlay/etc/systemd/system/packagekit.service.d/10-bg.conf
new file mode 100644
index 0000000..5455d13
--- /dev/null
+++ b/overlay/etc/systemd/system/packagekit.service.d/10-bg.conf
@@ -0,0 +1,4 @@
+[Service]
+Slice=system-bg.slice
+Nice=10
+IOSchedulingClass=idle
diff --git a/overlay/etc/systemd/system/system-bg.slice b/overlay/etc/systemd/system/system-bg.slice
new file mode 100644
index 0000000..55a1d5c
--- /dev/null
+++ b/overlay/etc/systemd/system/system-bg.slice
@@ -0,0 +1,15 @@
+[Unit]
+Description=Background system services (low priority)
+Documentation=https://git.s8n.ru/veilor-org/veilor-os/src/branch/main/docs
+Before=slices.target
+
+# Holds dnf metadata refresh, PackageKit, fwupd, and other deferrable
+# system maintenance. CPUWeight=20 vs default 100 means these yield
+# 5:1 to the rest of system.slice under contention; idle systems still
+# get full speed. MemoryHigh=4G is a soft cap — kernel reclaims pages
+# rather than evicting interactive workloads when these grow.
+
+[Slice]
+CPUWeight=20
+IOWeight=50
+MemoryHigh=4G
diff --git a/overlay/etc/systemd/system/user-.slice.d/10-boost.conf b/overlay/etc/systemd/system/user-.slice.d/10-boost.conf
new file mode 100644
index 0000000..39daf79
--- /dev/null
+++ b/overlay/etc/systemd/system/user-.slice.d/10-boost.conf
@@ -0,0 +1,7 @@
+[Slice]
+# Logged-in user sessions get 3x weight vs default. Combined with
+# system-bg.slice CPUWeight=20, ratio is 15:1 in the interactive
+# session's favour when CPU is contended — kwin/plasmashell win
+# scheduling over dnf-makecache / fwupd-refresh / packagekit.
+CPUWeight=300
+IOWeight=200
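
The 15:1 and 5:1 figures quoted in the changelog entry and the unit comments follow from proportional weight arithmetic. The following is a minimal Python sketch of that arithmetic, not part of the patch. It assumes the compared cgroups compete as direct siblings, which is a simplification: in the real hierarchy `system-bg.slice` sits under `system.slice` and `user-1000.slice` under `user.slice`, so the parents' weights also apply. `cpu_shares` is a hypothetical helper, not a systemd API.

```python
def cpu_shares(weights):
    """Return each cgroup's fraction of CPU time when all are runnable.

    Assumption: the groups are direct siblings, so each share is simply
    weight / sum(weights). Real cgroup v2 hierarchies apply this rule
    level by level.
    """
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Boosted user session (CPUWeight=300) vs. system-bg.slice (CPUWeight=20).
contended = cpu_shares({"user-session": 300, "system-bg.slice": 20})
ratio = contended["user-session"] / contended["system-bg.slice"]

# Default-weight unit (100) vs. system-bg.slice (20), as in the slice comment.
default_case = cpu_shares({"default-unit": 100, "system-bg.slice": 20})
default_ratio = default_case["default-unit"] / default_case["system-bg.slice"]

print(f"interactive:background = {ratio:.0f}:1")
print(f"default:background     = {default_ratio:.0f}:1")
```

Shares only bind when every group has runnable work. An idle sibling's share is redistributed to the others, which is why the entry describes the weights as proportional rather than hard caps.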