legacy-arrflix/testing/ROLLBACK.md
s8n d9d6bdba64 testing/ folder: theme-edit guides + error catalog + headless recipes
7 docs in /testing/ — institutional memory after 6+ regressions in
24-48h on the v6 theme. Read before any edit.

  README.md           — index + quickstart
  THEMING.md          — safe-edit checklist + layer/specificity tables
  ERROR-PATTERNS.md   — 12 cataloged patterns (Symptom/Cause/Diag/Fix/Prev)
  HEADLESS-PROBE.md   — 11 playwright recipes (md5 chain, darkPct,
                        ancestor bg sample, dropdown listItem probe)
  ROLLBACK.md         — 8 emergency revert recipes (overlay, branding,
                        encoding, full-from-repo, dev-clone-prod,
                        git-revert, pw-reset, bind-mount inode-swap)
  SMOKE-TEST.md       — manual + headless verify checklist
  DEPLOY.md           — dev → prod promotion workflow with backup +
                        chown root + restart inode-swap

Empty subdirs: snippets/, recipes/, incidents/ (post-mortems land here).

Goal: stop reinventing the same fixes. Catalog every error class,
codify the recovery, build a skills folder for future ARRFLIX work.
2026-05-10 00:47:20 +01:00


ROLLBACK — emergency revert procedures

When something breaks, follow these recipes. Each is one shell block + verify step.

When to use this

Any of:

  • Live users report black screens, missing UI, can't log in
  • Headless probe shows darkPct > 50% during playback
  • /Branding/Css.css returns 0 bytes
  • Container won't start or stays unhealthy
  • curl -s https://arrflix.s8n.ru/web/index.html | grep -c ARRFLIX-MIDDLE-THEME-BEGIN returns 0
  • Owner says "rollback" — don't debate, restore first, diagnose after
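The two machine-checkable triggers in this list (marker count and CSS size) can be folded into one triage helper. This is a minimal sketch, not part of the original runbook: the function names count_marker and triage are hypothetical, the 30000-byte floor is borrowed from the ROLLBACK 2 expectation, and the public Css.css URL in the commented wiring is an assumption.

```shell
#!/bin/sh
# Hypothetical triage helpers; names are illustrative, not from the runbook.

# Count lines on stdin that contain the overlay marker (same as grep -c,
# but without the non-zero exit status when the count is 0).
count_marker() {
  grep -c 'ARRFLIX-MIDDLE-THEME-BEGIN' || true
}

# triage MARKER_COUNT CSS_BYTES
# Prints which rollback to consider; returns 0 when nothing looks wrong.
triage() {
  marker_count=$1
  css_bytes=$2
  if [ "$marker_count" -eq 0 ]; then
    echo "overlay marker missing: consider ROLLBACK 1"
    return 1
  fi
  if [ "$css_bytes" -lt 30000 ]; then
    echo "Css.css too small ($css_bytes B): consider ROLLBACK 2"
    return 2
  fi
  echo "ok"
  return 0
}

# Example wiring (needs network access, so left commented out; the
# Css.css URL is assumed to live on the same public host):
# m=$(curl -s https://arrflix.s8n.ru/web/index.html | count_marker)
# b=$(curl -s https://arrflix.s8n.ru/Branding/Css.css | wc -c)
# triage "$m" "$b"
```

The helper only covers the checks that reduce to a number; user reports, container health and the owner's call still need a human.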

ROLLBACK 1 — overlay (most common)

Symptom: theme regression, wrong colors, missing features after a deploy. Login or home page renders but looks wrong.

Source: every prod deploy creates a .bak.pre-<reason>.<unix-ts> file in /opt/docker/jellyfin/web-overrides/. Pick the most recent (or the one matching the last-known-good state — e.g. index.html.bak.pre-favfix.1778318089 is pre-v6+favfix).

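ssh user@nullstone 'set -e
# Newest backup by mtime; a name sort would order by the <reason> field before the timestamp.
TS=$(ls -t /opt/docker/jellyfin/web-overrides/index.html.bak.* 2>/dev/null | head -1)
[ -n "$TS" ] || { echo "No backup found" >&2; exit 1; }
# The helper container only sees the directory as /d, so use the basename, not the host path.
BAK=$(basename "$TS")
echo "Restoring from: $TS"
docker run --rm --userns=host -v /opt/docker/jellyfin/web-overrides:/d:rw alpine sh -c "cp /d/$BAK /d/index.html && chown root:root /d/index.html && md5sum /d/index.html"
docker restart jellyfin && sleep 12
docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'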

Verify: curl -s https://arrflix.s8n.ru/web/index.html | grep -c ARRFLIX-MIDDLE-THEME-BEGIN returns 1. Then hard-refresh the browser (Ctrl+Shift+R) — defeats Service Worker + HTTP cache.
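Because backup names put the reason before the timestamp (.bak.pre-<reason>.<unix-ts>), ordering the names themselves groups them by reason first. A sketch of picking the newest backup by its trailing timestamp instead; the function name newest_backup is hypothetical, and it assumes every name ends in a numeric field and contains no tab characters.

```shell
# newest_backup: read backup paths on stdin (one per line) and print the one
# whose trailing .<unix-ts> field is numerically largest.
newest_backup() {
  # Prefix each line with its last dot-separated field, sort on that number,
  # keep the last line, then strip the prefix again.
  awk -F. '{ print $NF "\t" $0 }' | sort -n | tail -n 1 | cut -f2-
}

# Example (on the host):
# ls /opt/docker/jellyfin/web-overrides/index.html.bak.* | newest_backup
```

This still returns the most recent file, not necessarily the last-known-good one; when they differ, name the backup explicitly.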

ROLLBACK 2 — branding.xml

Symptom: /Branding/Css.css returns 0 bytes (Cineplex theme stops loading site-wide). Usually caused by an unescaped <video> literal or other XML parse failure inside <CustomCss> (silent failure — HTTP 200 empty body, no admin alert).

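ssh user@nullstone 'set -e
# Newest backup by mtime; a name sort would order by the <reason> field before the timestamp.
TS=$(ls -t /home/docker/jellyfin/config/config/branding.xml.bak.* 2>/dev/null | head -1)
[ -n "$TS" ] || { echo "No backup found" >&2; exit 1; }
# The helper container only sees the directory as /d, so use the basename, not the host path.
BAK=$(basename "$TS")
echo "Restoring from: $TS"
docker run --rm --userns=host -v /home/docker/jellyfin/config/config:/d:rw alpine sh -c "cp /d/$BAK /d/branding.xml"
docker restart jellyfin && sleep 12
docker exec jellyfin curl -s http://127.0.0.1:8096/Branding/Css.css | wc -c'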

Expect: > 30000 bytes (v6-stable serves 36 256 B). 0 bytes = still broken — the backup itself was corrupt; pick an older .bak.* and retry.
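Since the failure is silent, it can help to screen a branding.xml (or a candidate backup) for a raw "<" inside <CustomCss> before restoring it. The following is a rough heuristic sketch, not a real XML parser: the function name is hypothetical, and it will misreport files that wrap the CSS in a CDATA section or contain XML comments.

```shell
# custom_css_has_raw_lt FILE
# Returns 0 if a literal "<" appears between <CustomCss> and </CustomCss>,
# which is the kind of content the symptom above attributes the empty
# Css.css response to. Returns non-zero otherwise (including when the
# element is absent).
custom_css_has_raw_lt() {
  tr '\n' ' ' < "$1" \
    | sed -n 's/.*<CustomCss>\(.*\)<\/CustomCss>.*/\1/p' \
    | grep -q '<'
}

# Example (on the host):
# custom_css_has_raw_lt /home/docker/jellyfin/config/config/branding.xml \
#   && echo "raw < inside CustomCss: escape it as &lt; or pick another backup"
```

Where a proper XML tool is installed on the host, a well-formedness check with that tool is the more reliable test.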

ROLLBACK 3 — encoding.xml (HDR / tonemap)

Symptom: HDR10 playback looks washed out / grey, or transcode fails after flipping EnableTonemapping. Backup created pre-flip as encoding.xml.bak.pre-tonemap.1778318089.

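ssh user@nullstone 'set -e
# Newest backup by mtime; a name sort would order by the <reason> field before the timestamp.
TS=$(ls -t /home/docker/jellyfin/config/config/encoding.xml.bak.* 2>/dev/null | head -1)
[ -n "$TS" ] || { echo "No backup found" >&2; exit 1; }
# The helper container only sees the directory as /d, so use the basename, not the host path.
BAK=$(basename "$TS")
echo "Restoring from: $TS"
docker run --rm --userns=host -v /home/docker/jellyfin/config/config:/d:rw alpine sh -c "cp /d/$BAK /d/encoding.xml"
docker restart jellyfin && sleep 12
docker exec jellyfin grep -E "EnableTonemapping|TonemappingAlgorithm|HardwareAccelerationType" /config/config/encoding.xml'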

Verify: values match the last-known-good (v6-stable: EnableTonemapping=true, TonemappingAlgorithm=bt2390, HardwareAccelerationType=none). Stop any in-flight HDR10 transcode and re-start it from the client.
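The comparison against last-known-good can be scripted rather than eyeballed. A minimal sketch, assuming each setting sits on one line as <Tag>value</Tag> (which is what the grep above relies on too); xml_value and encoding_ok are hypothetical names, and the expected values are the v6-stable ones quoted in the Verify step.

```shell
# xml_value FILE TAG: print the text of the first <TAG>...</TAG> found on a
# single line of FILE. Line-based, not a real XML parser.
xml_value() {
  sed -n "s:.*<$2>\(.*\)</$2>.*:\1:p" "$1" | head -n 1
}

# encoding_ok FILE: succeed only if all three settings match v6-stable.
encoding_ok() {
  [ "$(xml_value "$1" EnableTonemapping)" = "true" ] &&
  [ "$(xml_value "$1" TonemappingAlgorithm)" = "bt2390" ] &&
  [ "$(xml_value "$1" HardwareAccelerationType)" = "none" ]
}

# Example (on the host):
# encoding_ok /home/docker/jellyfin/config/config/encoding.xml \
#   || echo "encoding.xml differs from v6-stable"
```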

ROLLBACK 4 — full prod = exact state from repo HEAD

When you don't trust the live state (drift, tampering, multiple bad deploys), force prod to match git origin/main:

cd /tmp/arrflix-recon
git fetch origin && git checkout origin/main
md5sum web-overrides/index.html
scp web-overrides/index.html user@nullstone:/tmp/repo-overlay.html
ssh user@nullstone 'set -e
docker run --rm --userns=host -v /opt/docker/jellyfin/web-overrides:/d:rw -v /tmp:/src:ro alpine sh -c "cp /src/repo-overlay.html /d/index.html && chown root:root /d/index.html && md5sum /d/index.html"
docker restart jellyfin && sleep 12
docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'

Verify: container md5 == repo md5 (v6-stable: 364cc890c58f02d07cf50b43b31a48f0). curl -s https://arrflix.s8n.ru/web/index.html | md5sum should match too.
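The three-way check (repo, container, served bytes) reduces to comparing three digests. A small sketch with hypothetical helper names file_md5 and md5_chain_ok; it assumes md5sum is available wherever it runs.

```shell
# file_md5 FILE: print only the hex digest of FILE.
file_md5() {
  md5sum "$1" | cut -d' ' -f1
}

# md5_chain_ok REPO CONTAINER SERVED: succeed only if all three digests are
# non-empty and identical.
md5_chain_ok() {
  [ -n "$1" ] && [ "$1" = "$2" ] && [ "$2" = "$3" ]
}

# Example wiring (needs ssh and network access, so left commented out):
# repo=$(file_md5 web-overrides/index.html)
# cont=$(ssh user@nullstone 'docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html' | cut -d' ' -f1)
# live=$(curl -s https://arrflix.s8n.ru/web/index.html | md5sum | cut -d' ' -f1)
# md5_chain_ok "$repo" "$cont" "$live" || echo "md5 chain broken"
```

An empty digest (for example from a failed ssh call) counts as a mismatch, so a broken probe cannot pass as a match.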

ROLLBACK 5 — dev = exact clone of prod

When dev has drifted and you want to reset it to live prod state (for a clean theme test sandbox):

ssh user@nullstone 'set -e
docker run --rm --userns=host \
  -v /opt/docker/jellyfin/web-overrides:/p:ro \
  -v /opt/docker/jellyfin-dev/web-overrides:/d:rw \
  alpine sh -c "cp /p/index.html /d/index-dev.html && chown 1000:1000 /d/index-dev.html && md5sum /p/index.html /d/index-dev.html"
docker restart jellyfin-dev && sleep 12
docker exec jellyfin-dev md5sum /jellyfin/jellyfin-web/index.html'

Note dev's overlay filename is index-dev.html (NOT index.html) and ownership is user:user (1000:1000), unlike prod's root:root. Also resync branding.xml if needed: source /home/docker/jellyfin/config/config/branding.xml → dest /home/docker/jellyfin-dev/config/config/branding.xml (backup as branding.xml.bak.dev-pre-resync first).
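The branding resync mentioned in the note can be sketched as a backup-then-copy helper run on the host. The function name resync_with_backup is hypothetical. Ownership of the copied file may need fixing afterwards; the source only states 1000:1000 for the dev overlay and the dev data directory, so treat the right owner for dev's branding.xml as something to confirm.

```shell
# resync_with_backup SRC DEST TAG
# Saves the current DEST as DEST.bak.TAG (if DEST exists), then copies SRC
# over DEST. Fails without touching DEST when SRC is missing.
resync_with_backup() {
  src=$1; dest=$2; tag=$3
  [ -f "$src" ] || { echo "missing source: $src" >&2; return 1; }
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak.$tag" || return 1
  fi
  cp "$src" "$dest"
}

# Example (on the host, followed by a restart of jellyfin-dev):
# resync_with_backup \
#   /home/docker/jellyfin/config/config/branding.xml \
#   /home/docker/jellyfin-dev/config/config/branding.xml \
#   dev-pre-resync
```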

ROLLBACK 6 — git revert last commit

When the bad change is in repo HEAD and you want to undo it with a new inverse commit (git revert preserves history rather than rewriting it), then redeploy cleanly:

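cd /tmp/arrflix-recon
# ROLLBACK 4 leaves this clone on a detached HEAD; switch back to main so the revert lands on the branch being pushed.
git checkout main && git pull --ff-only origin main
git log --oneline -5
git revert HEAD --no-edit
git push origin main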

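Then redeploy via ROLLBACK 4 to push the reverted state to prod. If the bad commit is more than one back, use git revert <sha> for each (newest first), or git revert <bad-sha>^..HEAD --no-edit for a range. The ^ matters: a range A..B excludes A itself, so <bad-sha>..HEAD would leave the bad commit in place.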

ROLLBACK 7 — recover dev test/123 password

Symptom: dev login broken, can't auth as test/123. Standard sqlite password-reset cycle (dev only — never prod).

ssh user@nullstone 'set -e
docker stop jellyfin-dev
DB=/home/docker/jellyfin-dev/config/data/jellyfin.db
cp "$DB" "${DB}.bak.pre-pwreset.$(date +%s)"
# The SQL string literal needs real single quotes; \x27 is not expanded by sh or sqlite3.
docker run --rm -v /home/docker/jellyfin-dev/config/data:/d:rw alpine sh -c "apk add --no-cache sqlite >/dev/null && sqlite3 /d/jellyfin.db \"UPDATE Users SET Password=NULL, EasyPassword=NULL WHERE Username='\''test'\'';\""
chown -R 1000:1000 /home/docker/jellyfin-dev/config/data
docker start jellyfin-dev && sleep 12'

Then in browser: log in as test with blank password → admin → user test → set password to 123. Verify curl -X POST https://dev.arrflix.s8n.ru/Users/AuthenticateByName -H 'Content-Type: application/json' -d '{"Username":"test","Pw":"123"}' returns a token.
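"Returns a token" can be checked mechanically by pulling the token out of the response body. A sketch only: it assumes the JSON response carries a string field named AccessToken, uses sed rather than a JSON parser, and the function name is hypothetical. Depending on the Jellyfin version, the endpoint may also expect an Authorization header with client details; that is outside what this runbook shows.

```shell
# extract_access_token: read a JSON response on stdin and print the value of
# its "AccessToken" field, or nothing if the field is absent.
extract_access_token() {
  tr -d '\n' \
    | sed -n 's/.*"AccessToken"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example wiring (needs network access, so left commented out):
# tok=$(curl -s -X POST https://dev.arrflix.s8n.ru/Users/AuthenticateByName \
#   -H 'Content-Type: application/json' \
#   -d '{"Username":"test","Pw":"123"}' | extract_access_token)
# [ -n "$tok" ] && echo "login ok" || echo "login failed"
```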

ROLLBACK 8 — bind-mount inode swap (just restart)

Symptom: you changed an overlay file and the served bytes don't match the file on disk. docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html differs from md5sum /opt/docker/jellyfin/web-overrides/index.html. Bind-mount captured the old inode; container is serving stale.

ssh user@nullstone 'docker restart jellyfin && sleep 12 && docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'
# or for dev:
ssh user@nullstone 'docker restart jellyfin-dev && sleep 12 && docker exec jellyfin-dev md5sum /jellyfin/jellyfin-web/index.html'

This is not a "rollback" per se — it's the cure for any cp-without-restart that left the container out of sync.
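To see whether an edit actually swapped the inode, record it before the change and compare afterwards. A minimal sketch with hypothetical helper names; it assumes the usual cause is a tool that replaces the file via rename instead of writing in place, which the source does not spell out.

```shell
# inode_of FILE: print the inode number of FILE (POSIX ls -i).
inode_of() {
  ls -i "$1" | awk '{ print $1 }'
}

# inode_changed BEFORE_INODE FILE: succeed if FILE no longer has the inode
# number that was recorded earlier.
inode_changed() {
  [ "$1" != "$(inode_of "$2")" ]
}

# Example (on the host):
# before=$(inode_of /opt/docker/jellyfin/web-overrides/index.html)
# ... edit or deploy the overlay ...
# inode_changed "$before" /opt/docker/jellyfin/web-overrides/index.html \
#   && echo "inode swapped: restart the container"
```

An unchanged inode does not prove the container is in sync; the md5 comparison above remains the authoritative check.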

Backups directory layout

Path                                                         Purpose
/opt/docker/jellyfin/web-overrides/index.html.bak.*          prod overlay (root:root)
/opt/docker/jellyfin-dev/web-overrides/index-dev.html.bak.*  dev overlay (user:user, note -dev suffix)
/home/docker/jellyfin/config/config/branding.xml.bak.*       prod branding
/home/docker/jellyfin/config/config/encoding.xml.bak.*       prod encoding
/home/docker/jellyfin-dev/config/config/branding.xml.bak.*   dev branding
/home/docker/jellyfin-dev/config/data/jellyfin.db.bak.*      dev sqlite (user db)

Keep the most recent .bak per file; older ones can be deleted (per doc 30 cleanup). Never delete a .bak.* you haven't verified is older than the current good state.
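A dry-run listing makes the cleanup rule easier to apply safely. The sketch below prints every backup of a file except the one with the largest trailing timestamp and deletes nothing; older_backups is a hypothetical name, and it assumes backup names end in .<unix-ts> and contain no tabs or newlines.

```shell
# older_backups DIR BASENAME
# Prints every DIR/BASENAME.bak.* except the one with the largest trailing
# timestamp, oldest first. Prints only; nothing is deleted.
older_backups() {
  for f in "$1/$2".bak.*; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$f" ] && printf '%s\n' "$f"
  done | awk -F. '{ print $NF "\t" $0 }' | sort -n | sed '$d' | cut -f2-
}

# Example (on the host):
# older_backups /opt/docker/jellyfin/web-overrides index.html
```

Review the list against the current good state before removing anything; "newest" and "known good" are not always the same file.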

After any rollback

  1. Notify users — restart drops in-flight stream sessions; if anyone was mid-playback they'll get bumped.
  2. Open an incident in testing/incidents/ (post-mortem) — what broke, what backup was used, container md5 before/after, owner-visible impact.
  3. Add the failure mode to testing/ERROR-PATTERNS.md if novel.
  4. Verify v6-stable invariants — overlay md5 prod==dev, /Branding/Css.css > 30 000 B, EnableTonemapping=true, login + playback both green via testing/SMOKE-TEST.md.
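Step 2 is easier to follow through on when the post-mortem file already exists with the required fields. A sketch of a stub generator; the function name and the <UTC date>-<slug>.md filename convention are hypothetical, and the fields are the ones listed in step 2.

```shell
# new_incident DIR SLUG: create DIR/<UTC date>-SLUG.md with the post-mortem
# fields from step 2 and print its path. Refuses to overwrite an existing file.
new_incident() {
  dir=$1; slug=$2
  path="$dir/$(date -u +%Y-%m-%d)-$slug.md"
  [ -e "$path" ] && { echo "exists: $path" >&2; return 1; }
  mkdir -p "$dir" || return 1
  cat > "$path" <<EOF
Incident: $slug
What broke:
Backup used:
Container md5 before:
Container md5 after:
Owner-visible impact:
EOF
  printf '%s\n' "$path"
}

# Example (from the repo root):
# new_incident testing/incidents css-empty
```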