7 docs in /testing/ — institutional memory after 6+ regressions in
24-48h on the v6 theme. Read before any edit.
README.md — index + quickstart
THEMING.md — safe-edit checklist + layer/specificity tables
ERROR-PATTERNS.md — 12 cataloged patterns (Symptom/Cause/Diag/Fix/Prev)
HEADLESS-PROBE.md — 11 playwright recipes (md5 chain, darkPct,
ancestor bg sample, dropdown listItem probe)
ROLLBACK.md — 8 emergency revert recipes (overlay, branding,
encoding, full-from-repo, dev-clone-prod,
git-revert, pw-reset, bind-mount inode-swap)
SMOKE-TEST.md — manual + headless verify checklist
DEPLOY.md — dev → prod promotion workflow with backup +
chown root + restart inode-swap
Empty subdirs: snippets/, recipes/, incidents/ (post-mortems land here).
Goal: stop reinventing the same fixes. Catalog every error class,
codify the recovery, build a skills folder for future ARRFLIX work.
ROLLBACK — emergency revert procedures
When something breaks, follow these recipes. Each is one shell block + verify step.
When to use this
Any of:
- Live users report black screens, missing UI, can't login
- Headless probe shows darkPct > 50% during playback
- /Branding/Css.css returns 0 bytes
- Container won't start or stays unhealthy
- curl -s https://arrflix.s8n.ru/web/index.html | grep -c ARRFLIX-MIDDLE-THEME-BEGIN returns 0
- Owner says "rollback": don't debate, restore first, diagnose after
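The measurable triggers in the list above can be folded into one check. The following is a minimal sketch, not part of the existing tooling: the function name should_rollback and its argument order are hypothetical, and the caller is assumed to have already collected the three numbers (CSS byte count, theme marker count, darkPct from the headless probe).

```shell
#!/bin/sh
# Hypothetical helper: decide whether the measurable rollback triggers are met.
#   $1  byte count returned by /Branding/Css.css
#   $2  number of ARRFLIX-MIDDLE-THEME-BEGIN markers in the served index.html
#   $3  darkPct from the headless probe (integer percent)
# Prints one reason per failed check; exit status 0 means "roll back".
should_rollback() {
  css_bytes=$1
  marker_count=$2
  dark_pct=$3
  reasons=0
  if [ "$css_bytes" -eq 0 ]; then
    echo "branding css is empty"
    reasons=$((reasons + 1))
  fi
  if [ "$marker_count" -eq 0 ]; then
    echo "theme marker missing from index.html"
    reasons=$((reasons + 1))
  fi
  if [ "$dark_pct" -gt 50 ]; then
    echo "darkPct above 50"
    reasons=$((reasons + 1))
  fi
  [ "$reasons" -gt 0 ]
}
```

Container health and an explicit owner request are left out on purpose: both are judgment calls that a script should not make.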
ROLLBACK 1 — overlay (most common)
Symptom: theme regression, wrong colors, missing features after a deploy. Login or home page renders but looks wrong.
Source: every prod deploy creates a .bak.pre-<reason>.<unix-ts> file in /opt/docker/jellyfin/web-overrides/. Pick the most recent (or the one matching the last-known-good state — e.g. index.html.bak.pre-favfix.1778318089 is pre-v6+favfix).
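ssh user@nullstone 'set -e
# Newest backup by mtime; sort -V would order by the <reason> text, not the timestamp.
BAK=$(ls -t /opt/docker/jellyfin/web-overrides/index.html.bak.* 2>/dev/null | head -n 1)
[ -n "$BAK" ] || { echo "No backup found" >&2; exit 1; }
echo "Restoring from: $BAK"
# The host directory is mounted at /d, so address the backup by basename inside the container.
NAME=$(basename "$BAK")
docker run --rm --userns=host -v /opt/docker/jellyfin/web-overrides:/d:rw alpine sh -c "cp /d/$NAME /d/index.html && chown root:root /d/index.html && md5sum /d/index.html"
docker restart jellyfin && sleep 12
docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'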
Verify: curl -s https://arrflix.s8n.ru/web/index.html | grep -c ARRFLIX-MIDDLE-THEME-BEGIN returns 1. Then hard-refresh the browser (Ctrl+Shift+R) — defeats Service Worker + HTTP cache.
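Because every backup name ends in a unix timestamp, the newest backup can also be chosen by that suffix instead of by name order or mtime. The helper below is a sketch under that assumption; newest_backup is a hypothetical name, and files without a purely numeric suffix are skipped.

```shell
#!/bin/sh
# newest_backup <dir> <base>
# Print the path of <dir>/<base>.bak.* whose trailing .<unix-ts> is largest.
# Returns 1 when no timestamped backup exists.
newest_backup() {
  dir=$1
  base=$2
  best=
  best_ts=-1
  for f in "$dir/$base".bak.*; do
    [ -e "$f" ] || continue          # glob matched nothing
    ts=${f##*.}                      # text after the last dot
    case $ts in
      ''|*[!0-9]*) continue ;;       # skip names without a numeric suffix
    esac
    if [ "$ts" -gt "$best_ts" ]; then
      best=$f
      best_ts=$ts
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

On the host this would be called as newest_backup /opt/docker/jellyfin/web-overrides index.html. If the last-known-good state is not the newest backup, pick the file by hand instead.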
ROLLBACK 2 — branding.xml
Symptom: /Branding/Css.css returns 0 bytes (Cineplex theme stops loading site-wide). Usually caused by an unescaped <video> literal or other XML parse failure inside <CustomCss> (silent failure — HTTP 200 empty body, no admin alert).
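ssh user@nullstone 'set -e
# Newest backup by mtime; sort -V would order by the <reason> text, not the timestamp.
BAK=$(ls -t /home/docker/jellyfin/config/config/branding.xml.bak.* 2>/dev/null | head -n 1)
[ -n "$BAK" ] || { echo "No backup found" >&2; exit 1; }
echo "Restoring from: $BAK"
# The host directory is mounted at /d, so address the backup by basename inside the container.
NAME=$(basename "$BAK")
docker run --rm --userns=host -v /home/docker/jellyfin/config/config:/d:rw alpine sh -c "cp /d/$NAME /d/branding.xml"
docker restart jellyfin && sleep 12
docker exec jellyfin curl -s http://127.0.0.1:8096/Branding/Css.css | wc -c'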
Expect: > 30000 bytes (v6-stable serves 36 256 B). 0 bytes = still broken — the backup itself was corrupt; pick an older .bak.* and retry.
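Since the failure is silent, it can help to screen a branding.xml (or a candidate backup) for raw markup inside CustomCss before restoring it. The check below is a crude heuristic, not an XML parser: custom_css_has_raw_markup is a hypothetical name, and it assumes the CSS is entity-escaped rather than wrapped in CDATA (a CDATA section would be flagged as a false positive).

```shell
#!/bin/sh
# custom_css_has_raw_markup <branding.xml>
# Exit 0 if a literal "<" appears between <CustomCss> and </CustomCss>,
# exit 1 otherwise (including when the element is absent).
custom_css_has_raw_markup() {
  # Join lines so multi-line CSS is handled, then isolate the element body.
  body=$(tr '\n' ' ' < "$1" | sed -n 's/.*<CustomCss>\(.*\)<\/CustomCss>.*/\1/p')
  case $body in
    *'<'*) return 0 ;;   # e.g. an unescaped <video> in a CSS comment
    *)     return 1 ;;
  esac
}
```

A file that passes this check can still be malformed in other ways; the byte count from /Branding/Css.css remains the real verification.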
ROLLBACK 3 — encoding.xml (HDR / tonemap)
Symptom: HDR10 playback looks washed out / grey, or transcode fails after flipping EnableTonemapping. Backup created pre-flip as encoding.xml.bak.pre-tonemap.1778318089.
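ssh user@nullstone 'set -e
# Newest backup by mtime; sort -V would order by the <reason> text, not the timestamp.
BAK=$(ls -t /home/docker/jellyfin/config/config/encoding.xml.bak.* 2>/dev/null | head -n 1)
[ -n "$BAK" ] || { echo "No backup found" >&2; exit 1; }
echo "Restoring from: $BAK"
# The host directory is mounted at /d, so address the backup by basename inside the container.
NAME=$(basename "$BAK")
docker run --rm --userns=host -v /home/docker/jellyfin/config/config:/d:rw alpine sh -c "cp /d/$NAME /d/encoding.xml"
docker restart jellyfin && sleep 12
docker exec jellyfin grep -E "EnableTonemapping|TonemappingAlgorithm|HardwareAccelerationType" /config/config/encoding.xml'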
Verify: values match the last-known-good (v6-stable: EnableTonemapping=true, TonemappingAlgorithm=bt2390, HardwareAccelerationType=none). Stop any in-flight HDR10 transcode and re-start it from the client.
ROLLBACK 4 — full prod = exact state from repo HEAD
When you don't trust the live state (drift, tampering, multiple bad deploys), force prod to match git origin/main:
cd /tmp/arrflix-recon
git fetch origin && git checkout origin/main
md5sum web-overrides/index.html
scp web-overrides/index.html user@nullstone:/tmp/repo-overlay.html
ssh user@nullstone 'set -e
docker run --rm --userns=host -v /opt/docker/jellyfin/web-overrides:/d:rw -v /tmp:/src:ro alpine sh -c "cp /src/repo-overlay.html /d/index.html && chown root:root /d/index.html && md5sum /d/index.html"
docker restart jellyfin && sleep 12
docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'
Verify: container md5 == repo md5 (v6-stable: 364cc890c58f02d07cf50b43b31a48f0). curl -s https://arrflix.s8n.ru/web/index.html | md5sum should match too.
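The md5 comparison in the verify step can be scripted once local copies of each stage exist (for example the repo file plus a curl download of the served page). This is a sketch with hypothetical helper names md5_of and md5_chain_ok; it assumes GNU md5sum, as used elsewhere in this document.

```shell
#!/bin/sh
# md5_of <file>: print only the hex digest of a file.
md5_of() {
  md5sum "$1" | cut -d' ' -f1
}

# md5_chain_ok <expected-md5> <file>...
# Succeed only if every file hashes to the expected digest.
# The first mismatch is reported on stderr and ends the check.
md5_chain_ok() {
  expected=$1
  shift
  for f in "$@"; do
    actual=$(md5_of "$f")
    if [ "$actual" != "$expected" ]; then
      echo "MISMATCH $f: $actual != $expected" >&2
      return 1
    fi
  done
}
```

A typical call would pass the repo digest first, then each downloaded copy, so the first stage that diverges is named in the error.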
ROLLBACK 5 — dev = exact clone of prod
When dev has drifted and you want to reset it to live prod state (for a clean theme test sandbox):
ssh user@nullstone 'set -e
docker run --rm --userns=host \
-v /opt/docker/jellyfin/web-overrides:/p:ro \
-v /opt/docker/jellyfin-dev/web-overrides:/d:rw \
alpine sh -c "cp /p/index.html /d/index-dev.html && chown 1000:1000 /d/index-dev.html && md5sum /p/index.html /d/index-dev.html"
docker restart jellyfin-dev && sleep 12
docker exec jellyfin-dev md5sum /jellyfin/jellyfin-web/index.html'
Note dev's overlay filename is index-dev.html (NOT index.html) and ownership is user:user (1000:1000), unlike prod's root:root. Also resync branding.xml if needed: source /home/docker/jellyfin/config/config/branding.xml → dest /home/docker/jellyfin-dev/config/config/branding.xml (backup as branding.xml.bak.dev-pre-resync first).
ROLLBACK 6 — git revert last commit
When the bad change is in repo HEAD and you want it undone (git revert adds a new commit that reverses it; the bad commit itself stays in history), then redeploy cleanly:
cd /tmp/arrflix-recon
git checkout main && git pull --ff-only origin main   # ROLLBACK 4 leaves HEAD detached
git log --oneline -5
git revert HEAD --no-edit
git push origin main
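Then redeploy via ROLLBACK 4 to push the reverted state to prod. If the bad commit is more than one back, use git revert <sha> for each, or git revert <bad-sha>^..HEAD --no-edit for a range. The ^ matters: <bad-sha>..HEAD starts after <bad-sha> and would leave the bad commit itself in place.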
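A range revert needs care with its start point: in git's range syntax, A..HEAD excludes A itself, while A^..HEAD includes it. The wrapper below is a small sketch of the inclusive form; revert_through is a hypothetical name, and it assumes the bad commit is not the root commit (a root commit has no parent for ^ to resolve).

```shell
#!/bin/sh
# revert_through <sha>
# Revert <sha> and every commit after it, newest first, one revert commit each.
# Refuses to run on a dirty tree so the reverts apply to a known state.
revert_through() {
  if ! git diff --quiet || ! git diff --cached --quiet; then
    echo "working tree has uncommitted changes" >&2
    return 1
  fi
  # <sha>^..HEAD includes <sha>; plain <sha>..HEAD would skip it.
  git revert --no-edit "$1^..HEAD"
}
```

After it finishes, git log --oneline should show one "Revert ..." commit per reverted commit, ready for git push origin main.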
ROLLBACK 7 — recover dev test/123 password
Symptom: dev login broken, can't auth as test/123. Standard sqlite password-reset cycle (dev only — never prod).
ssh user@nullstone 'set -e
docker stop jellyfin-dev
DB=/home/docker/jellyfin-dev/config/data/jellyfin.db
cp $DB ${DB}.bak.pre-pwreset.$(date +%s)
docker run --rm -v /home/docker/jellyfin-dev/config/data:/d:rw alpine sh -c "apk add --no-cache sqlite >/dev/null && sqlite3 /d/jellyfin.db \"UPDATE Users SET Password=NULL, EasyPassword=NULL WHERE Username='\''test'\'';\""
chown -R 1000:1000 /home/docker/jellyfin-dev/config/data
docker start jellyfin-dev && sleep 12'
Then in browser: log in as test with blank password → admin → user test → set password to 123. Verify curl -X POST https://dev.arrflix.s8n.ru/Users/AuthenticateByName -H 'Content-Type: application/json' -d '{"Username":"test","Pw":"123"}' returns a token.
ROLLBACK 8 — bind-mount inode swap (just restart)
Symptom: you changed an overlay file and the served bytes don't match the file on disk. docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html differs from md5sum /opt/docker/jellyfin/web-overrides/index.html. Bind-mount captured the old inode; container is serving stale.
ssh user@nullstone 'docker restart jellyfin && sleep 12 && docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'
# or for dev:
ssh user@nullstone 'docker restart jellyfin-dev && sleep 12 && docker exec jellyfin-dev md5sum /jellyfin/jellyfin-web/index.html'
This is not a "rollback" per se — it's the cure for any cp-without-restart that left the container out of sync.
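The recipes in this document wait a fixed 12 seconds after each restart. A polling loop is one alternative when the container is slow to come up. The helper below is a generic sketch (wait_until is a hypothetical name): it retries any command once per second until it succeeds or a time limit passes.

```shell
#!/bin/sh
# wait_until <limit-seconds> <command...>
# Rerun the command once per second until it succeeds.
# Returns 0 on success, 1 if the limit is reached first.
wait_until() {
  limit=$1
  shift
  elapsed=0
  until "$@"; do
    elapsed=$((elapsed + 1))
    [ "$elapsed" -ge "$limit" ] && return 1
    sleep 1
  done
}
```

For example, wait_until 60 docker exec jellyfin curl -sf -o /dev/null http://127.0.0.1:8096/health could stand in for sleep 12, assuming the server's /health endpoint is reachable from inside the container as in ROLLBACK 2.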
Backups directory layout
| Path | Purpose |
|---|---|
| /opt/docker/jellyfin/web-overrides/index.html.bak.* | prod overlay (root:root) |
| /opt/docker/jellyfin-dev/web-overrides/index-dev.html.bak.* | dev overlay (user:user, note -dev suffix) |
| /home/docker/jellyfin/config/config/branding.xml.bak.* | prod branding |
| /home/docker/jellyfin/config/config/encoding.xml.bak.* | prod encoding |
| /home/docker/jellyfin-dev/config/config/branding.xml.bak.* | dev branding |
| /home/docker/jellyfin-dev/config/data/jellyfin.db.bak.* | dev sqlite (user db) |
Keep the most recent .bak per file; older ones can be deleted (per doc 30 cleanup). Never delete a .bak.* you haven't verified is older than the current good state.
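The retention rule can be previewed without deleting anything. The sketch below only lists candidates: prune_candidates is a hypothetical name, and it assumes the trailing unix timestamp in each backup name is a reliable age ordering. It does not know which backup is the current good state, so the output still needs a manual check before anything is removed.

```shell
#!/bin/sh
# prune_candidates <dir> <base>
# Print every <dir>/<base>.bak.*.<unix-ts> except the one with the largest
# timestamp, oldest first. Read-only: nothing is deleted.
prune_candidates() {
  dir=$1
  base=$2
  for f in "$dir/$base".bak.*; do
    [ -e "$f" ] || continue            # glob matched nothing
    ts=${f##*.}                        # text after the last dot
    case $ts in
      ''|*[!0-9]*) continue ;;         # skip names without a numeric suffix
    esac
    printf '%s %s\n' "$ts" "$f"
  done | sort -n | sed '$d' | cut -d' ' -f2-
}
```

Reviewing the printed list by eye, then removing files one at a time, keeps the "never delete an unverified .bak" rule intact.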
After any rollback
- Notify users — restart drops in-flight stream sessions; if anyone was mid-playback they'll get bumped.
- Open an incident in testing/incidents/ (post-mortem): what broke, what backup was used, container md5 before/after, owner-visible impact.
- Add the failure mode to testing/ERROR-PATTERNS.md if novel.
- Verify v6-stable invariants: overlay md5 prod==dev, /Branding/Css.css > 30 000 B, EnableTonemapping=true, login + playback both green via testing/SMOKE-TEST.md.