legacy-arrflix/testing/ERROR-PATTERNS.md
s8n d9d6bdba64 testing/ folder: theme-edit guides + error catalog + headless recipes
7 docs in /testing/ — institutional memory after 6+ regressions in
24-48h on the v6 theme. Read before any edit.

  README.md           — index + quickstart
  THEMING.md          — safe-edit checklist + layer/specificity tables
  ERROR-PATTERNS.md   — 12 cataloged patterns (Symptom/Cause/Diag/Fix/Prev)
  HEADLESS-PROBE.md   — 11 playwright recipes (md5 chain, darkPct,
                        ancestor bg sample, dropdown listItem probe)
  ROLLBACK.md         — 8 emergency revert recipes (overlay, branding,
                        encoding, full-from-repo, dev-clone-prod,
                        git-revert, pw-reset, bind-mount inode-swap)
  SMOKE-TEST.md       — manual + headless verify checklist
  DEPLOY.md           — dev → prod promotion workflow with backup +
                        chown root + restart inode-swap

Empty subdirs: snipUSER-Es/, recipes/, incidents/ (post-mortems land here).

Goal: stop reinventing the same fixes. Catalog every error class,
codify the recovery, build a skills folder for future ARRFLIX work.
2026-05-10 00:47:20 +01:00

# ERROR-PATTERNS — recurring theme/deploy/playback bugs
> Bookmark this. Every pattern below has happened multiple times. Read before debugging.
## Index of past errors
| # | Pattern | First seen | Last seen | Recurrences |
|---|---------|------------|-----------|-------------|
| 1 | Black screen over video (CSS overlay) | 2026-05-09 INC1 | 2026-05-10 image-12 | 6+ |
| 2 | Video covers OSD controls (z-index too high) | 2026-05-10 image-12 | 2026-05-10 image-12 | 1 |
| 3 | Bind-mount inode swap (container serves stale) | 2026-05-09 backHide | 2026-05-10 v4-selector | 3+ |
| 4 | branding.xml XML parse silent fail | 2026-05-09 v6-stable | 2026-05-09 v6-stable | 1 |
| 5 | test/123 password nuked after docker cp | 2026-05-09 multi | 2026-05-10 multi | 5+ |
| 6 | DB readonly after docker cp (uid 101000) | 2026-05-09 multi | 2026-05-10 multi | 4+ |
| 7 | camelCase class typo (.htmlVideoPlayer) | 2026-05-09 a6cf925 | 2026-05-09 a6cf925 | 1 |
| 8 | Wrong Jellyfin class assumption | 2026-05-09 multi | 2026-05-10 multi | 3+ |
| 9 | HDR10 grey wash (tonemap off) | 2026-05-08 doc 21 | 2026-05-09 fix | 1 |
| 10 | Favicon clobbered by lockFavicon shim | 2026-05-09 favfix | 2026-05-09 favfix | 1 |
| 11 | Backdrop residue / carousel black band | 2026-05-09 multi | 2026-05-10 multi | 2+ |
| 12 | Quick Connect bypass + login mismatch | 2026-05-09 dev | 2026-05-09 dev | 1 |
## ERROR 1 — Black screen over video (CSS overlay)
**Symptom.** `<video>` decodes (`currentTime` advances, `readyState=4`, `videoWidth=1920`, `error=null`, `drawImage` luma >100) but viewport is opaque black. `darkPct=100%`. Doc 28: *"`<video>` is decoding actual pixels — yet a screenshot is all-black. Pixels never reach page composition."*
**Root cause.** Opaque `background-color` on a `<video>` ancestor while `body.arrflix-video-active` set OR `.htmlvideoplayer` in DOM. Offenders: `#videoOsdPage`, `.libraryPage`, `.layout-desktop`, `.pageContainer`, `.skinBody`, `.emby-scroller`.
**Diagnostic.** Probe DOM stack at video centre via `elementsFromPoint`; log `getComputedStyle(el).backgroundColor` per ancestor. `drawImage` luma >50 + screenshot all-black = overlay bug. See doc 28 §"Headless comparison".
**Fix.** Pair L1 (off-video opaque) / L2 (on-video transparent), scoped on body class:
```css
body.arrflix-video-active #videoOsdPage,
body.arrflix-video-active .libraryPage:has(.htmlvideoplayer) {
  background: transparent !important;
}
```
**Prevention.** Any new bg-color rule on layer 04 ancestors MUST scope `:not(.arrflix-video-active)`. Add `darkPct` assertion to `bin/headless-test-v2.py` (TODO doc 30/31). Ref docs/26 INC7-final, docs/28 INC7-final, docs/31 layer model.
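
The `darkPct` assertion mentioned above could be sketched roughly as follows. This is an illustration only, not the project's actual metric: the names `dark_pct` and `looks_like_overlay_bug`, the luma cutoff of 16, and the plain list-of-RGB-tuples input are all assumptions. A real headless test would feed in pixels decoded from a screenshot.

```python
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def luma(pixel: RGB) -> float:
    """Rec. 601 luma approximation for an 8-bit RGB pixel."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def dark_pct(pixels: Iterable[RGB], threshold: float = 16.0) -> float:
    """Percentage of pixels whose luma is below `threshold` (assumed cutoff)."""
    total = 0
    dark = 0
    for px in pixels:
        total += 1
        if luma(px) < threshold:
            dark += 1
    if total == 0:
        raise ValueError("no pixels supplied")
    return 100.0 * dark / total


def looks_like_overlay_bug(frame_luma: float, screenshot_dark_pct: float) -> bool:
    """Decoded frame is bright (luma > 50) but the composed page is all black."""
    return frame_luma > 50 and screenshot_dark_pct >= 100.0
```

Keeping the two measurements separate mirrors the diagnostic: the frame luma comes from `drawImage` on the `<video>` element, the dark percentage from the composed screenshot.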
## ERROR 2 — Video covers OSD controls (z-index hack)
**Symptom.** Frames visible, OSD scrubber/buttons clipped or unclickable.
**Root cause.** Forced `<video> { z-index: 9999 }` to "lift" above an unknown overlay. Stock OSD sits at z 1100–1500; lifting `<video>` buries the controls.
**Diagnostic.** DevTools → click where scrubber should be → if click target is `<video>`, z-index is wrong.
**Fix.** Revert. Delete the override. Real bug is always opaque ancestors (Error 1). Commit `d4ddf6f` reverted.
**Prevention.** Stock Jellyfin owns z 1000–2000. Never override. Docs/31: *"If you think you need to z-index `<video>` higher: you don't."*
## ERROR 3 — Bind-mount inode swap
**Symptom.** Host file md5 changed after `cp`/`scp`, but `docker exec md5sum` returns old hash.
**Root cause.** Single-file Docker bind mount tracks inode at container start, not path. `cp src dest` (or scp) creates a NEW inode; container keeps the old one. Docs/31: *"bind-mount inode swap doesn't refresh container view."*
**Diagnostic.**
```bash
ssh user@nullstone 'md5sum /opt/docker/jellyfin/web-overrides/index.html'
ssh user@nullstone 'docker exec jellyfin md5sum /jellyfin/jellyfin-web/index.html'
# Differ → inode swap
```
**Fix.** `docker restart jellyfin` (or `jellyfin-dev`) after every `cp`/`scp` of a single-file bind.
**Prevention.** Deploy: scp → restart → curl-verify md5. Ref docs/26 §A, docs/31 DO-NOT-DO.
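
A host-side check for the swap can be sketched with `os.stat`. The helper names (`inode_of`, `inode_swapped`, `demo`) are hypothetical, and `demo` simulates a replace-by-rename deploy in a temp directory rather than touching the real overlay path. Whether a given copy tool actually replaces the inode depends on the tool, so treat this as a way to observe the condition, not as proof of its cause.

```python
import os
import tempfile


def inode_of(path: str) -> int:
    """Inode number the path currently points at."""
    return os.stat(path).st_ino


def inode_swapped(path: str, inode_before: int) -> bool:
    """True if `path` now resolves to a different inode than before the deploy."""
    return inode_of(path) != inode_before


def demo() -> bool:
    """Simulate a replace-by-rename deploy in a temp dir and report the swap."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "index.html")
        with open(target, "w") as f:
            f.write("old")
        before = inode_of(target)
        staged = os.path.join(d, "index.html.new")
        with open(staged, "w") as f:
            f.write("new")
        os.replace(staged, target)  # rename over the target: path gets a new inode
        return inode_swapped(target, before)
```

Recording `inode_of(path)` before a deploy and calling `inode_swapped` afterwards gives an early signal that a restart is required, before the md5 comparison.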
## ERROR 4 — branding.xml XML parse silent fail
**Symptom.** Theme partially loads. `curl /Branding/Css.css` returns HTTP 200 with **0 bytes**. No log, no banner. Doc 30: *"Silent XML parse failures with zero UI feedback are the worst class of bug."*
**Root cause.** Unescaped `<` in `<CustomCss>`. CSS comment with `<video>` literal makes XML parser treat it as a child element. Branding loader catches the exception, serves empty CSS.
**Diagnostic.**
```bash
docker run --rm --userns=host -v /home/docker/jellyfin/config/config:/d:ro alpine sh -c \
"apk add --no-cache libxml2-utils >/dev/null 2>&1 && xmllint --noout /d/branding.xml"
curl -s https://arrflix.s8n.ru/Branding/Css.css | wc -c # expect ~36000
```
**Fix.** Escape: `<video>` → `&lt;video&gt;` for any `<tag>` literal in CSS comments.
**Prevention.** Add `xmllint --noout branding.xml` to CI gate (TODO doc 30/31). Smoke-test `/Branding/Css.css` byte count.
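
Where `xmllint` is not available, the same well-formedness gate can be approximated with Python's standard library. This is a minimal sketch: `branding_xml_error` and `wrap_custom_css` are hypothetical helpers, and the `BrandingOptions` root element is an assumption about the file layout (only `<CustomCss>` is named above).

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape


def branding_xml_error(xml_text: str) -> Optional[str]:
    """Return the parser's error message, or None if the XML is well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


def wrap_custom_css(css: str) -> str:
    """Build a minimal branding document with the CSS escaped (&, <, >)."""
    return f"<BrandingOptions><CustomCss>{escape(css)}</CustomCss></BrandingOptions>"
```

A CSS comment containing a literal `<video>` fails the check when embedded raw and passes once routed through `escape`.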
## ERROR 5 — test/123 password nuked after docker cp
**Symptom.** Dev login rejects `test/123` with 401, no password change requested.
**Root cause.** `docker cp` (or in-container cp) on `jellyfin.db` either replaced it with an older copy or triggered SQLite Error 8 readonly that rolled back the password write. Userns-remap leftovers (uid 101000) loop this.
**Diagnostic.**
```bash
ssh user@nullstone 'ls -ln /home/docker/jellyfin-dev/config/data/jellyfin.db' # expect uid 1000
docker logs jellyfin-dev 2>&1 | grep -iE "readonly|sqlite.*error 8"
```
**Fix.** Stop container; sqlite3 `UPDATE Users SET Password=NULL WHERE Username='test'`; restart; API-set password (doc 28 "Headless comparison" recipe).
**Prevention.** Never `docker cp` a live SQLite file. Stop first. `chown 1000:1000` after any cp into config volume.
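
The ownership rule can be checked across a whole config tree before restarting. This is a sketch under assumptions: `wrong_owner_paths` is a hypothetical helper, and the default uid of 1000 is taken from the prevention note above.

```python
import os
from typing import List


def wrong_owner_paths(root: str, expected_uid: int = 1000) -> List[str]:
    """List directories and files under `root` not owned by `expected_uid`."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, n) for n in filenames]
        for path in candidates:
            # lstat: report the link itself, do not follow symlinks
            if os.lstat(path).st_uid != expected_uid:
                offenders.append(path)
    return sorted(offenders)
```

An empty list means every entry is owned by the expected uid. Anything else is a candidate for `chown` before the container starts.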
## ERROR 6 — DB readonly after docker cp (uid 101000)
**Symptom.** POST `/Users/{id}/Configuration` returns 204 but GET shows field unchanged.
**Root cause.** `jellyfin.db` owned by uid 101000 (Docker userns subuid leftover); container runs as 1000. SQLite throws Error 8 readonly; API returns 204 anyway. Doc 26 §B: *"EVERY user-config save silently fails (HTTP 204 success, value not persisted)."*
**Diagnostic.**
```bash
ls -ln /home/docker/jellyfin/config/data/jellyfin.db # uid must be 1000
docker logs jellyfin 2>&1 | grep -i "readonly\|error 8"
```
**Fix.** `sudo chown -R 1000:1000 /home/docker/jellyfin/config /home/docker/jellyfin/cache && docker restart jellyfin`.
**Prevention.** Never trust 204 — always GET-verify (doc 26 forbidden #4, post-mortem #3). Init-container chowning to 1000:1000 on boot.
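
The "always GET-verify" rule can be expressed as a small wrapper. This is a sketch, not the project's client code: `save_and_verify` is a hypothetical name, and the `post` and `get` callables are stand-ins for whatever HTTP client performs the real requests.

```python
from typing import Any, Callable, Dict


def save_and_verify(
    post: Callable[[Dict[str, Any]], int],
    get: Callable[[], Dict[str, Any]],
    changes: Dict[str, Any],
) -> bool:
    """POST `changes`, then GET and confirm every field actually persisted."""
    status = post(changes)
    if status not in (200, 204):
        return False
    stored = get()
    # A 204 alone proves nothing: compare what the server now reports.
    return all(stored.get(key) == value for key, value in changes.items())
```

Injecting the two callables keeps the check independent of any particular HTTP library and makes the silent-failure case easy to simulate in tests.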
## ERROR 7 — camelCase class name typo
**Symptom.** `:has(.htmlVideoPlayer)` (camelCase V) never matches. Body stays opaque → black screen (Error 1).
**Root cause.** Jellyfin's class is **lowercase** `.htmlvideoplayer`. Commit `a6cf925` shipped the typo. Docs/31: *"There is no `.htmlVideoPlayer` (camelCase). Don't confuse them."*
**Diagnostic.** DevTools → search selector. "0 results" while video plays = wrong casing.
**Fix.** Use `.htmlvideoplayer` lowercase OR rely on `body.arrflix-video-active` toggled by JS (preferred — class-on-body robust to DOM changes).
**Prevention.** Read docs/31 layer model BEFORE writing `:has()`. Inspect live DOM, never guess casing.
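
Casing slips like this one can be caught mechanically by comparing the class names used in the theme CSS against a list captured from the live DOM. The sketch below assumes such a list is available (for example dumped by a headless probe). `casing_mismatches` and the simple class regex are hypothetical and will also pick up non-class dots such as file extensions inside `url(...)`.

```python
import re
from typing import Iterable, List, Tuple

CLASS_RE = re.compile(r"\.([A-Za-z_][A-Za-z0-9_-]*)")


def casing_mismatches(css: str, live_classes: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (used, actual) pairs where only the letter case differs."""
    live = set(live_classes)
    by_lower = {name.lower(): name for name in live}
    found: List[Tuple[str, str]] = []
    for used in CLASS_RE.findall(css):
        actual = by_lower.get(used.lower())
        if used not in live and actual is not None:
            pair = (used, actual)
            if pair not in found:
                found.append(pair)
    return found
```

For the typo above, a stylesheet containing `:has(.htmlVideoPlayer)` checked against a live list containing `htmlvideoplayer` yields the pair `("htmlVideoPlayer", "htmlvideoplayer")`.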
## ERROR 8 — Wrong Jellyfin class assumption
**Symptom.** CSS rule appears correct but does nothing. e.g. `body.itemDetailPage { ... }` — body's actual class is `libraryDocument` in 10.10.3.
**Root cause.** Jellyfin web class names aren't stable. Doc 26 post-mortem #4: *"Body class on detail pages is `libraryDocument`, not `itemDetailPage`. Use `.itemDetailPage` directly or `:has(.itemDetailPage)`."* Same trap with `.skinHeader` (z:1) vs `.videoPlayerContainer` (z:1000).
**Diagnostic.** `document.body.className`; `document.querySelectorAll('.itemDetailPage').length`.
**Fix.** Use `:has()` on ancestors: `.layout-desktop:has(.itemDetailPage) { ... }`.
**Prevention.** Inspect live DOM in 10.10.3 — never trust forum/older-doc selectors. Ref doc 26 post-mortem #4, doc 31 layer model.
## ERROR 9 — HDR10 grey wash (tonemap off)
**Symptom.** HDR10 source (4K Rick & Morty) renders desaturated grey. NOT pure black — distinct from Error 1.
**Root cause.** `EnableTonemapping=false` while serving HDR10 (`smpte2084` / `bt2020nc` / `yuv420p10le`). ffmpeg passes HDR pixels to SDR transcode without zscale→tonemap→format → wrong colorspace → grey wash. Doc 21 traced for R&M Pilot.
**Diagnostic.** `grep -E "EnableTonemapping|TonemappingAlgorithm" /home/docker/jellyfin/config/config/encoding.xml` — expect `true` + `bt2390`.
**Fix.** `sed -i 's|<EnableTonemapping>false|<EnableTonemapping>true|' /home/docker/jellyfin/config/config/encoding.xml` then `docker restart jellyfin`. Or Dashboard → Playback → Tone Mapping. Commit `1168ba6`.
**Prevention.** Tonemap ON for any library with HDR10 sources. Don't toggle off as "perf fix" — grey wash is worse than slow encode.
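
The `grep` diagnostic can be turned into a check that fails loudly. This is a sketch: `tonemap_problems` is a hypothetical helper, and it only assumes that `EnableTonemapping` and `TonemappingAlgorithm` appear as elements somewhere below the root of `encoding.xml`.

```python
import xml.etree.ElementTree as ET
from typing import List


def tonemap_problems(encoding_xml: str, expected_algorithm: str = "bt2390") -> List[str]:
    """Return human-readable problems; an empty list means the config looks right."""
    root = ET.fromstring(encoding_xml)
    problems = []
    enabled = (root.findtext(".//EnableTonemapping") or "").strip().lower()
    if enabled != "true":
        problems.append("EnableTonemapping is not true")
    algorithm = (root.findtext(".//TonemappingAlgorithm") or "").strip()
    if algorithm != expected_algorithm:
        problems.append(
            f"TonemappingAlgorithm is {algorithm!r}, expected {expected_algorithm!r}"
        )
    return problems
```

Pass in the contents of `encoding.xml` as a string. A missing element is reported the same way as a wrong value.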
## ERROR 10 — Favicon clobbered by lockFavicon shim
**Symptom.** Browser tab shows stock Jellyfin purple-swirl despite ARRFLIX overlay shipping A-mark icon link.
**Root cause.** Jellyfin's `lockFavicon()` runs on `setInterval`, re-pinning its `<link rel="icon">` and overwriting our overlay's link. Two shims race; Jellyfin wins.
**Diagnostic.** `[...document.querySelectorAll('link[rel*=icon]')].map(l => l.href)` — our `data-arrflix-icon="A"` element gone or href swapped.
**Fix.** ARRFLIX shim stamps `data-arrflix-icon="A"` and runs its own re-pin loop on the same interval. Removes stock wordmark links every tick. Commit `1168ba6`.
**Prevention.** Never one-shot DOM writes for elements Jellyfin actively manages (favicon, body class, drawer). Use observe + reapply, or class-on-body.
## ERROR 11 — Backdrop residue / carousel black band
**Symptom.** Backdrop missing on detail-page mid-scroll → black band behind "More from Season N". Or previous item's blurhash sticks after navigation.
**Root cause.** `.backdropContainer` defaults to non-fixed positioning — scrolls out of view (INC2). Sections below paint against body's `#000`. Separately, opaque `.emby-scroller { background:#000 !important }` (originally for home grey strips) leaks into detail-page carousel wrappers (INC4).
**Diagnostic.** DOM-walk every `.scrollSlider` in `.itemDetailPage`, log ancestors with non-transparent computed bg. Locator pattern in doc 26 §INC4.
**Fix.** Pin backdrop `position:fixed; top:0; height:100vh; z-index:0` (INC2). Transparent-scope `.itemDetailPage` wrappers: `.emby-scroller`, `.scrollSliderContainer`, `.detailVerticalSection*`, `.padded-bottom-page`, `.itemsContainer` (INC3+INC4).
**Prevention.** Any new `background:#000 !important` MUST be scoped from day one — never bare `.emby-scroller` (INC4 lesson). Headless test takes top + scrolled (50%) screenshots.
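
The "never bare `.emby-scroller`" rule lends itself to a lint pass over the theme CSS. The sketch below is deliberately naive and rests on assumptions: `bare_important_backgrounds` is a hypothetical helper, the regexes handle only flat rules (no nested `@media` blocks), and "bare" is taken to mean a selector consisting of a single class with no ancestor or pseudo-class scoping.

```python
import re
from typing import List

RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")
BARE_CLASS_RE = re.compile(r"^\.[A-Za-z_][A-Za-z0-9_-]*$")
BG_RE = re.compile(r"background(?:-color)?\s*:\s*([^;]*?)\s*!important", re.IGNORECASE)


def has_opaque_important_bg(body: str) -> bool:
    """True if the rule body forces a non-transparent background."""
    return any(m.group(1).strip().lower() != "transparent" for m in BG_RE.finditer(body))


def bare_important_backgrounds(css: str) -> List[str]:
    """Unscoped single-class selectors that force an opaque background."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # drop comments first
    offenders = []
    for selectors, body in RULE_RE.findall(css):
        if not has_opaque_important_bg(body):
            continue
        for selector in selectors.split(","):
            selector = selector.strip()
            if BARE_CLASS_RE.match(selector):
                offenders.append(selector)
    return offenders
```

Scoped selectors such as `.itemDetailPage .emby-scroller` pass, as do `transparent` overrides. Only the unscoped opaque rules are returned.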
## ERROR 12 — Quick Connect bypass + login mismatch
**Symptom.** Login shows Quick Connect button + user-picker tiles instead of curated ARRFLIX manual-login.
**Root cause.** `system.xml` has `QuickConnectAvailable=true` AND non-admin `IsHidden=false` so picker enumerates. Theme expects QC off and all non-admin hidden.
**Diagnostic.**
```bash
grep QuickConnectAvailable /home/docker/jellyfin/config/config/system.xml
docker exec jellyfin sqlite3 /config/data/jellyfin.db "SELECT Username,IsHidden FROM Users"
```
**Fix.** `QuickConnectAvailable=false` in `system.xml`, restart. `UPDATE Users SET IsHidden=1 WHERE Username != 's8n'`.
**Prevention.** Closed in v6-stable (doc 30). Headless test asserts no `.btnQuickConnect` and no `.cardBox-login` tiles on login.
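
Both conditions in the root cause can be checked offline. This is a sketch under assumptions: the helper names are hypothetical, the XML check only assumes a `QuickConnectAvailable` element somewhere below the root of `system.xml`, and the SQL relies on the `Users` table having `Username` and `IsHidden` columns as used in the diagnostic above.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List


def quick_connect_enabled(system_xml: str) -> bool:
    """True if system.xml still advertises Quick Connect."""
    root = ET.fromstring(system_xml)
    value = (root.findtext(".//QuickConnectAvailable") or "").strip().lower()
    return value == "true"


def visible_non_admins(conn: sqlite3.Connection, admin: str) -> List[str]:
    """Usernames other than `admin` that the login picker would enumerate."""
    rows = conn.execute(
        "SELECT Username FROM Users WHERE Username != ? AND IsHidden = 0 "
        "ORDER BY Username",
        (admin,),
    )
    return [row[0] for row in rows]
```

Run the query against a copy of the database or with the container stopped, in line with the rule about not touching a live SQLite file.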
## Pattern recognition cheat sheet
| If you see... | Likely # |
|---|---|
| black video, audio plays, element decoding | 1 |
| video clipped, OSD controls hidden | 2 |
| local file changed, live page unchanged | 3 |
| empty `/Branding/Css.css` | 4 |
| test/123 401 / sqlite readonly logs | 5+6 |
| selector typo, "rule not applied" | 7+8 |
| HDR content washed-out grey | 9 |
| wrong logo in browser tab | 10 |
| black band behind carousel | 11 |
| Quick Connect / user picker on login | 12 |
## When to add a new error
After ANY incident:
1. Add to index table with date + recurrence count.
2. Add full Symptom / Root cause / Diagnostic / Fix / Prevention.
3. Update cheat sheet if symptom phrasing is novel.
4. If recurrence ≥ 3: add CI gate to `testing/SMOKE-TEST.md`.
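
Step 4 can be applied to the index table mechanically. In this sketch the `INDEX` dict is transcribed from the Recurrences column above, a trailing `+` is read as "at least", and the function names are hypothetical.

```python
import re
from typing import Dict, List


def recurrence_count(cell: str) -> int:
    """Parse a Recurrences cell such as '6+' or '1' into its lower bound."""
    match = re.match(r"\s*(\d+)", cell)
    if match is None:
        raise ValueError(f"unparseable recurrence cell: {cell!r}")
    return int(match.group(1))


def needs_ci_gate(recurrences: Dict[int, str], threshold: int = 3) -> List[int]:
    """Error numbers whose recurrence count meets the CI-gate threshold."""
    return sorted(
        number
        for number, cell in recurrences.items()
        if recurrence_count(cell) >= threshold
    )


# Recurrences column from the index table, keyed by error number.
INDEX = {
    1: "6+", 2: "1", 3: "3+", 4: "1", 5: "5+", 6: "4+",
    7: "1", 8: "3+", 9: "1", 10: "1", 11: "2+", 12: "1",
}
```

With the current table, `needs_ci_gate(INDEX)` returns `[1, 3, 5, 6, 8]`.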